Asbestos is a highly toxic silicate mineral that has been widely used in many products for its insulating, non-flammable, and heat-resistant properties. Exposure to high concentrations of asbestos can lead to chronic inflammation of the lungs and to cancer. Since its toxicity became known, much effort has been undertaken, and continues to this day, to remove asbestos from buildings, roofs, and other materials used in industry and in public spaces. Asbestos detection is a manual, complex, and time-intensive process that requires an experienced expert to produce consistent and correct results. In an attempt to reduce manual labor and increase consistency, machine learning models have recently been adopted to automate the detection process.
Convolutional neural networks are a subset of deep learning algorithms that have proven very effective for computer vision tasks, outperforming humans in many areas. They automatically extract feature maps from the provided dataset, encoding relevant spatial information and patterns. These properties make convolutional neural networks a very good fit for detecting asbestos fibers, especially because they reduce the need for the knowledge and experience acquired through years of working with asbestos, which is one of the main prerequisites for a laboratory to detect asbestos fibers successfully.
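To illustrate what is meant by a convolutional layer extracting a spatial feature map, the following is a minimal NumPy sketch, not part of the thesis's models: a single hand-written vertical-edge kernel is slid over a synthetic image. In a trained network, such kernels are learned automatically rather than specified by hand, and many are stacked per layer.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D cross-correlation: slide the kernel over the image,
    producing one feature map that responds where the pattern matches."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge kernel -- the kind of low-level filter a CNN
# typically learns in its first layer.
edge_kernel = np.array([[1, 0, -1],
                        [1, 0, -1],
                        [1, 0, -1]], dtype=float)

# Synthetic 6x6 image: dark left half, bright right half.
img = np.zeros((6, 6))
img[:, 3:] = 1.0

fmap = conv2d(img, edge_kernel)
# The feature map is zero in flat regions and responds strongly
# (magnitude 3) in the columns that straddle the vertical edge.
```

The same sliding-window principle, repeated over stacked layers, is what lets a network build the mid- and high-level filters discussed later in this thesis.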
In this thesis, several state-of-the-art architectures are explained and compared against each other with regard to performance, size, and complexity. Since the provided dataset of 2’000 images is rather small, techniques such as transfer learning, data augmentation, and dataset alterations are applied and evaluated extensively. Another branch of investigation covers modifications to the current architectures that aim to achieve better performance while decreasing overall complexity. Visualization techniques should allow a better understanding of what the models learn, increase users’ trust, and support the reasoning behind the findings.
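As a rough sketch of the kind of augmentation meant here, the snippet below combines random cropping and random horizontal flipping in plain NumPy. The crop size, flip probability, and image shapes are illustrative assumptions, not the thesis's actual pipeline or tooling.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def augment(image, crop_size):
    """Random crop plus random horizontal flip: two cheap transforms
    that enlarge the effective size of a small training set."""
    h, w = image.shape
    top = rng.integers(0, h - crop_size + 1)
    left = rng.integers(0, w - crop_size + 1)
    patch = image[top:top + crop_size, left:left + crop_size]
    if rng.random() < 0.5:          # flip with probability 0.5 (assumed)
        patch = patch[:, ::-1]      # horizontal flip
    return patch

# Toy 8x8 "image"; each call yields a slightly different 5x5 view.
img = np.arange(64, dtype=float).reshape(8, 8)
batch = [augment(img, crop_size=5) for _ in range(4)]
```

Because each epoch sees different crops and flips of the same underlying images, the network is less able to memorize individual training samples, which is why such transforms help against overfitting on small datasets.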
There is no clear preference for any of the investigated architectures. They all performed in a similar range, with DenseNet121 achieving the best results with an accuracy of 86% when trained from scratch. Transfer learning from ImageNet was beneficial in every single case, leading to an improvement of roughly 7%. Visualizations show that this is due to mid- and high-level feature maps transferred from ImageNet; the models trained from scratch fail to develop comparably complex mid- and high-level filters from so few images. Data augmentation and cropping methods help to reduce overfitting and slightly increase accuracy, by 2-3%. Dataset alterations failed to consistently increase performance, as did the architecture modifications that allowed larger images to be fed to the network. Reducing the overall number of parameters by over 99% did not harm performance but decreased complexity drastically, making these models much faster, easier to interpret, and easier to deploy.