What's Up, Doc?
Documentation is normally something I only think about when I'm stuck. If I don't know how to do something in Azure or I'm not sure how a function works in a particular library...I go to the docs. I'm pretty sure I'm not alone in that. Documentation is often something we consume to help us, and there's a love-hate relationship with it: on one hand it has the information we need, on the other it can be really quite dull. But good documentation isn't just something we should be consuming; we should be writing it too!
Why Should You Care About Writing Docs?
If I think about the part my team plays in MLOps for the wider Data Science team, we are helping to set standards and best practices so that projects across the spectrum stay consistent. We want to save Data Scientists time on the required processes that surround their models. They shouldn't be tied up in those things.
One of the key reasons we have MLOps is to make people's lives easier getting from model to deployment and beyond. Not only do the processes need to be in place, but people also need to understand them and know how to follow them and set them up. So, how can we efficiently and effectively record what the standards are and provide the guides and explanations that go with them? Well, documentation of course!
Documentation is the other side of the coin to the processes needed for MLOps maturity (i.e. how well your MLOps processes are established and implemented). You need both parts to succeed. That's not to say documentation is only for people in MLOps, mind. Models and libraries need good docs too so that they can be used and understood by other people!
What Should We Be Documenting?
Thankfully, not everything. First off, there's no point reinventing the wheel...you don't need to write documentation that already exists. For instance, if you're using tools that have their own documentation, then you don't need to document the tools themselves, but you might well need to document how to use them in your particular context. It's a subtle distinction: tools will (or at least should!) have their own comprehensive documentation covering how they work. But as with most things in life, the same tool can be used for different purposes, and that is exactly where your documentation comes in.
Your documentation should show clearly how the tools you have chosen will be used in your specific context. For instance, there are several ways of running an automated pipeline, and the right one depends on what the pipeline is for and which tools you choose. Say you're using Microsoft Azure to automate pipelines. There are at least three ways to do this: Databricks, the Azure SDK or Azure DevOps. Microsoft has detailed docs on these tools. What we would need to write is internal documentation that shows which tool(s) to use, in what way, and how this fits the agreed Data Science project setup. (Side note: this is also a reason why standardising practices across teams can be helpful!)
Where Does the Documentation Go?
The answer to this question is pretty specific to your company or project, as it depends on the tools you have access to or subscribe to. The important part is that everyone who needs access has it. There's no point writing this documentation and saving it locally where only you can see it, or where you have to share it every time someone asks. It needs to be available as and when people need it. That usually means somewhere in the cloud.
As usual, the tech world provides for tech needs. There are some helpful platforms set up exactly for writing and sharing documentation. To name just a couple that I've had a bit of experience with: GitHub Pages is great for creating documentation for associated GitHub repos, and Confluence, which is part of the Atlassian suite, is a shared document platform that lets you share with specific parties. But there are so many out there. Some are entirely browser based, others have desktop apps. A quick search will bring up many options.
How Do I Write Good Documentation?
As simple as it might sound, writing good technical documentation is a skill. It's something I struggle with because my writing style is generally more chatty. The goal of documentation is to get the point across clearly and succinctly, so it needs to be assertive and direct. Think less like you're writing a blog, and more like you're writing instructions with just enough context for them to make sense.
Thankfully, again, the tech world provides. I have been going through the *free* technical writing training provided by Google. Some things might feel a bit basic if English is your first language, but it's helpful to have a reminder. The great thing about self-paced learning is that you can pick and choose what you spend the time on.
As with any skill, it takes time to get really good at it, and it may never be perfected (largely because there are too many preferences out there to please everyone!). Making a start is the first step. Then ask the people around you to review what you've written, both those you hope will use it and those with more experience writing docs. In time you'll get there. I'm right there with you on this journey!