Challenges for AI-Driven Software Development: Part 1
February 11, 2025

We are big fans of using AI in our work.
We use it a lot (see Benefits of AI for Software Development), but AI-driven software development comes with real challenges!
In some ways, using AI for software development is like adopting any new tool or methodology: we can use various metrics such as error rates and testing cycles to evaluate the expected or claimed benefits. But there are major differences:
- The scale and breadth of benefits are much greater, covering every aspect of software development from testing efficiency to coding quality and standards, and even analysis of the project itself in the context of every other project undertaken.
- AI goes beyond being just a tool or process for developers to follow. It can itself become the user of a tool or method, and may even be in a position to evaluate that tool or method.
- An almost unlimited amount of data and knowledge is available to AI. Learning is possible, so the ‘tool’ or ‘methodology’ has the ability to continuously evolve and improve.
In measuring AI success, we need to adopt an approach that keeps these differences in mind and goes beyond the usual metrics and indicators. In some cases, our human intelligence may provide the augmentation rather than the other way round. I’ve seen some suggested practices (source: ChatGPT) for implementing AI in software development that recommend:
- Start Small and Scale Gradually: Implement AI tools in non-critical projects first to understand their limitations and strengths. At The Bridge, we’ve had (unsurprisingly) better results by using the most appropriate AI tool for the job.
- Focus on Training and Upskilling: Invest in training your development teams to work effectively with AI tools
- Monitor and Optimise Continuously: AI models need regular updates and monitoring to remain effective
- Ensure Ethical AI Practices: Address concerns related to data privacy, security, and bias in AI algorithms
Sensible, and to a large extent these apply to any new technology or methodology. But unique to AI, the single most common problem, as reported across numerous case studies and industry analyses, is data quality and availability. This issue consistently emerges as a significant bottleneck for the successful implementation of AI projects. Let’s break down why this is the case:
Why Data Quality and Availability Are Major Challenges
- Dependence on Large Datasets:
- AI models, especially machine learning and deep learning algorithms, rely heavily on vast amounts of high-quality, well-labeled data to learn and make accurate predictions. However, many organisations struggle to collect, clean, and curate the data needed to train these models effectively
- Poor data quality—such as missing values, inconsistent formats, and unstructured inputs—leads to unreliable model outcomes. In fact, it’s estimated that data preparation tasks (cleaning, labelling, organising) can consume up to 80% of the time spent on AI projects.
- Bias and Incomplete Data:
- If the training data is biased or unrepresentative, AI models can produce skewed or unfair results. This is particularly problematic in sensitive applications like hiring, lending, or healthcare, where biased models can perpetuate existing inequalities
- Furthermore, many organisations face challenges with data availability due to privacy regulations (e.g., GDPR), especially when dealing with sensitive or personally identifiable information (PII)
- Data Silos and Accessibility:
- In large organisations, data is often stored in silos across different departments, making it difficult to access and integrate. This fragmentation can hinder the AI model’s ability to analyse data holistically.
- For example, financial institutions like banks may have data spread across legacy systems, making it challenging to consolidate and utilise for AI-driven insights.
- Labelling and Annotation Costs:
- Supervised machine learning models require labelled data, which often necessitates manual annotation. This process can be time-consuming and costly, especially when domain-specific expertise is required (e.g., medical imaging, legal documents).
- Dynamic Data Environments:
- Data is not static; it evolves over time. AI models trained on outdated data may perform poorly as new trends emerge. Continuous monitoring and updating of training datasets are necessary, but this adds to the complexity and cost of maintaining AI systems
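To make the data-preparation burden concrete, here is a minimal sketch in pure Python (the record layout, field names, and date formats are invented for illustration) showing the kind of normalising and quality-gating that typically precedes any model training: inconsistent formats are reconciled, and records that fail basic checks are rejected rather than silently passed through.

```python
from datetime import datetime

# Hypothetical raw records as they might arrive from different source
# systems: inconsistent date formats, missing values, and mixed casing.
raw_records = [
    {"name": "Alice", "signup": "2024-01-15", "country": "AU"},
    {"name": "bob",   "signup": "15/01/2024", "country": None},
    {"name": "",      "signup": "unknown",    "country": "au"},
]

DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")

def parse_date(value):
    """Try each known format; return None if the value cannot be parsed."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).date()
        except (ValueError, TypeError):
            continue
    return None

def clean(records):
    """Normalise fields and separate records that fail basic quality checks."""
    cleaned, rejected = [], []
    for rec in records:
        name = (rec.get("name") or "").strip().title()
        signup = parse_date(rec.get("signup"))
        country = (rec.get("country") or "").strip().upper() or None
        if name and signup:
            cleaned.append({"name": name, "signup": signup, "country": country})
        else:
            rejected.append(rec)
    return cleaned, rejected

cleaned, rejected = clean(raw_records)
# Two records survive; the third is rejected for a missing name and
# an unparseable date, and would be routed to manual review.
```

Even this toy pipeline illustrates the point: most of the code handles format reconciliation and validation, not any AI logic, which is how data preparation comes to dominate project effort.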
Industry Examples
- Healthcare: AI models designed for diagnostic purposes often struggle due to the variability in medical data quality across different hospitals and regions
- Finance: AI initiatives have to overcome challenges related to legacy data systems and data security, especially when applying machine learning models for fraud detection
- Retail and E-commerce: Large retailers invest heavily in data infrastructure to optimise their AI algorithms for personalised recommendations and inventory management. Even so, maintaining the accuracy and relevance of data remains a persistent challenge
Conclusion
Without clean, reliable, and representative data, AI models cannot achieve the desired accuracy and effectiveness. This data dependency is why data quality and availability is the most frequently reported challenge in AI development. Addressing this issue often involves investing in better data infrastructure, adopting robust data governance practices, and leveraging automated data cleaning tools to ensure models are trained on high-quality datasets.
For organisations looking to implement AI, a significant portion of their effort will need to focus on solving data-related issues to ensure the success of their AI initiatives.
If you’d like to discuss your website or software development options with The Bridge, feel free to contact me on lawrence@thebridgedigital.com.au