The AI Project Blueprint

An overview of the steps involved in taking an AI solution from paper to production.

AI is not just exciting, it's transformative. Turning an idea into an impactful product that revolutionizes your business operations is a thrilling and attainable undertaking. You're likely familiar with the latest tools and web apps that show off what AI is capable of, but how can you put the underlying tools to use in your business? And once you've found a suitable opportunity, how do you bring it to life? In this article I lay out a high-level framework for AI platform design and development, from the initial idea to a successful launch. Each individual section warrants a comprehensive deep-dive, and I plan to write a separate article for each, but by the end of this one you'll know the key steps of an AI project and what to expect at each. Let's jump right in.

Step 1: Use Case Definition

Defining the use case for your AI solution is the most important step in ensuring a focused development process. In this step we strategically define what we are seeking to accomplish with this new technology, and the precise tasks we need the AI to perform.

A key distinction that will help in later steps is whether your AI solution needs to handle a broad array of tasks or excel at one specific task. The implications of this decision are touched on in the next section, but defining the tasks your AI solution needs to perform is vital.

It is during this initial phase that you also establish the performance metrics that will be used to measure the effectiveness of your AI solution. These should be business-centered metrics you are seeking to improve, together with how much you intend to improve them. Define metrics such as enhancing overall report throughput by 20%, boosting employee productivity by 15%, or cutting the rate of shipped defects by 30%. These example metrics are simplified, but they represent the kind of measurable outcomes you should strive to define. It is important that these metrics be as specific as possible since they will help drive the next step: technology selection.
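
To make this concrete, success criteria like these can be recorded in code so later measurements can be checked against them. A minimal sketch in Python; the SuccessMetric class, the metric names, and the numbers are all invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class SuccessMetric:
    """A business-centered metric with a baseline, a target, and a direction."""
    name: str
    baseline: float
    target: float
    unit: str
    higher_is_better: bool = True

    def met(self, measured: float) -> bool:
        """Check whether a measured value reaches the target."""
        if self.higher_is_better:
            return measured >= self.target
        return measured <= self.target

# Illustrative targets mirroring the examples above.
metrics = [
    SuccessMetric("report_throughput", baseline=100, target=120, unit="reports/week"),
    SuccessMetric("shipped_defect_rate", baseline=0.10, target=0.07,
                  unit="defects/unit", higher_is_better=False),
]
```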

You will know this phase is complete when you have a defined set of goals and intelligent behaviors you need your AI model to exhibit, along with the specific, measurable metrics you are aiming to achieve with your new AI solution.

Step 2: Technology Definition and Selection

After defining what we need our AI solution to do and the metrics by which we will measure its success, we are now ready to identify the specific technologies we will use.

Whether we need a model to perform a broad scope of related tasks or one very specific task will help us determine which AI models we use, and the extent to which training is required. Generalized models come in many varieties and are well suited to broad tasks. Fine-tuning these models makes them even more effective at the specific tasks they are assigned than using them in their original, unmodified form.
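
To make the fine-tuning path concrete, here is a minimal sketch using the Hugging Face transformers and datasets libraries, assuming a binary text-classification task. The checkpoint, the public IMDB stand-in dataset, and the training settings are assumptions chosen for brevity, not recommendations:

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Start from a small generalized model and adapt it to the task.
checkpoint = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# The IMDB reviews set stands in for your own proprietary labeled data.
dataset = load_dataset("imdb")
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, padding="max_length"),
    batched=True,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetune-out", num_train_epochs=1),
    train_dataset=tokenized["train"].shuffle(seed=42).select(range(2000)),
)
trainer.train()
```

In a real project you would substitute your own labeled data, hold out an evaluation set, and tune the training settings rather than accept these defaults.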

At the other end of the spectrum are objectives that require an AI model to be very good at a specific task. These kinds of results are attainable through thorough training with specific datasets (typically proprietary datasets that your business owns), and fine-tuning processes that aim to hone the precision of a model.

It is not recommended to train one model for two specific but different tasks. Attempting to do so typically produces a model that underperforms on both tasks compared to creating a separate model for each task. Just as you can't expect the same level of expertise when studying two disciplines simultaneously as when focusing on one, the same is true for model training.

Now it is time to identify and analyze the data to be used in model training. Data truly is king when it comes to the robustness of AI solution development. Data is education for a model: the model ingests the data you provide and learns from it. The higher the quality and quantity of the data you have access to, the better you can expect your model to perform.

For any custom AI development project it is important to examine the data you have at your disposal. This could be business reports, financial data, performance metrics, etc. It tends to be data your business has collected over the years and can now leverage through AI. The quantity and quality of the data you have access to significantly influences the performance of the AI solution you are aiming to build, and the type of data you have access to is closely linked to the algorithms and models you can use to build your solution.
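
Before committing to training, it helps to profile that data for basic quality signals such as missing values, duplicates, and label balance. A small sketch assuming tabular data in pandas; the file name and label column are placeholders:

```python
import pandas as pd

def profile_dataset(df: pd.DataFrame, label_column: str) -> dict:
    """Summarize basic quality signals for a candidate training dataset."""
    return {
        "rows": len(df),
        "duplicate_rows": int(df.duplicated().sum()),
        "missing_by_column": df.isna().sum().to_dict(),
        # A heavily skewed label distribution often calls for rebalancing.
        "label_distribution": df[label_column].value_counts(normalize=True).to_dict(),
    }

# Example: profile historical business reports before training.
df = pd.read_csv("reports.csv")  # placeholder file
print(profile_dataset(df, label_column="outcome"))  # placeholder label column
```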

You will know this phase is complete when you can list the specific AI technologies you are going to use, and when you have aggregated the data you will be using during training and fine-tuning.

Step 3: Development Plan and Error Mitigation

We are nearing the development phase, but we first need to establish a few things. The first is the development cadence, or methodology. This is important, as you need a development process that is adaptable and allows for the routine feedback necessary to refine a model's performance to fit your needs. Agile is a common methodology throughout the software industry because it provides rapid feedback and is highly adaptable.

Adjustments may be required along the way when you identify new data sources that increase accuracy or lower computational costs, when test output indicates necessary changes to the training process, or when new technologies become available. An adaptable methodology like Agile is well suited to changes like these, but its fit may vary depending on your circumstances and project.

It is also critical to analyze the impact that errors in AI output may have on your business. Depending on the intended use of your AI product, the impact of incorrect model output could be trivial or devastating. By walking through various use cases and workflows you will develop a broad understanding of the potential impact of these errors. From there you can define remediation steps that minimize the consequences.
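
One lightweight way to record that walkthrough is a failure-mode inventory that ranks each error scenario by severity and likelihood. The entries and the 1-to-5 scoring scale below are invented examples, not a prescribed taxonomy:

```python
# Each entry pairs an error scenario with estimated severity and likelihood
# (both scored 1-5 here) and a remediation step.
failure_modes = [
    {"scenario": "model hallucinates a figure in a generated report",
     "severity": 4, "likelihood": 2,
     "remediation": "require human sign-off on generated figures"},
    {"scenario": "minor formatting error in output",
     "severity": 1, "likelihood": 4,
     "remediation": "post-process output with validation rules"},
]

# Rank by risk score (severity x likelihood) to prioritize mitigations.
for fm in sorted(failure_modes,
                 key=lambda f: f["severity"] * f["likelihood"], reverse=True):
    score = fm["severity"] * fm["likelihood"]
    print(f'{score:>2}  {fm["scenario"]} -> {fm["remediation"]}')
```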

The source of these AI errors can be traced back to a few fundamental AI vulnerabilities: model bias, unintended behavior and hallucinations, and lack of explainability. I will link to deep-dive articles on each of these vulnerabilities, but the truth of the matter is that all models are susceptible to these problems. This list is not exhaustive, and these topics are the subject of ongoing research at some of the largest players in the AI industry and in academia. While improvements are being made, these vulnerabilities, and their impact, still need to be considered on a case-by-case basis.

Step 4: Develop, Adapt, and Align the Model

Now that we have defined objectives, metrics, technologies, and a development plan for our AI solution, it’s time to execute. The training and development phase significantly exceeds all other stages in both duration and collective effort, making up the majority of the AI solution design and development process.

This stage involves iteratively training, testing, and adjusting the model until desired performance is achieved. During this process bugs will need to be fixed, data will need to be adjusted, and test user feedback will be required.
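
The shape of that loop can be sketched schematically. The target accuracy, the round limit, and the train_one_round, evaluate, and adjust_training callables below are hypothetical placeholders for your own pipeline:

```python
TARGET_ACCURACY = 0.90  # an assumed target carried over from Step 1
MAX_ROUNDS = 10

def iterate_until_target(model, train_one_round, evaluate, adjust_training):
    """Train, test, and adjust until the performance target is met."""
    for round_num in range(1, MAX_ROUNDS + 1):
        model = train_one_round(model)
        score = evaluate(model)  # measured on held-out data, never training data
        print(f"round {round_num}: accuracy={score:.3f}")
        if score >= TARGET_ACCURACY:
            return model         # performance metric met; phase complete
        adjust_training(score)   # e.g. revise data, labels, or hyperparameters
    raise RuntimeError("target not reached; revisit data or model choice")
```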

Once the core model is trained and integrated into a user-friendly interface, it is important to make routine deployments so test users can provide feedback. At the end of the day, your solution will fail if your intended users don't enjoy using it. Because of this, representative target users should be involved from the start of the project to offer insight where appropriate and, even more crucially, be given ample time to test the product and provide feedback.

You will know this phase is complete when you have a model that meets your defined performance metrics, and the model has been integrated into a solution that is accessible by your intended users.

Step 5: Monitor and Improve

Following the successful launch of your AI solution, the focus shifts to an iterative cycle of monitoring, model improvement, and user experience optimization. Collecting quantitative data and usage statistics provides invaluable insight into performance and usability, and will shine a light on critical areas for improvement.

The model, or models, at the center of your AI solution will likely need re-training or additional fine-tuning over time. AI models tend to become less effective over time due to changes in user behavior or the environments in which they operate. Additionally, models may simply become obsolete as industry advancements are made.
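
Detecting that kind of degradation is often approached as a data-drift check: compare the distribution of a model input at training time against recent production traffic. A minimal sketch using SciPy's two-sample Kolmogorov-Smirnov test; the 0.05 threshold and the synthetic data are illustrative:

```python
import numpy as np
from scipy.stats import ks_2samp

def feature_drifted(training_values, production_values, alpha=0.05):
    """Return True if the feature's distribution has shifted significantly."""
    statistic, p_value = ks_2samp(training_values, production_values)
    return p_value < alpha

# Synthetic demonstration: production values have drifted upward.
rng = np.random.default_rng(0)
train = rng.normal(loc=0.0, scale=1.0, size=5000)
prod = rng.normal(loc=0.4, scale=1.0, size=5000)
print(feature_drifted(train, prod))  # True: a candidate for re-training
```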

Updating a model through any of these methods should be done with the same iterative approach outlined above, whereby testing and performance analysis serve as guides to ensure the solution achieves the outcomes you are aiming for.

Ryan Mord