Detecting and correcting 3D printing errors on the fly

Nature Communications


August 15, 2022

Material extrusion is one of the most common additive manufacturing (AM) methods for reasons including its relatively low cost, little post-processing, compatibility with many materials and multi-material capabilities. These have made extrusion AM promising in numerous areas including healthcare, medical devices, aerospace, and robotics (it's also why we like it so much at Matta). However, a key reason why many of these applications sadly remain at the research stage is that extrusion AM is vulnerable to diverse errors. These range from small-scale dimensional inaccuracies and mechanical weaknesses to total build failures. To counteract errors, a skilled worker typically must observe the AM process, recognise an error, stop the print, remove the part, and then appropriately adjust the parameters for a new part. If a new material or printer is used, this process takes more time as the worker gains experience with the new setup. Even then, errors may be missed, especially if the worker is not continuously observing each process. We thought that this was silly, and knew AI could help solve this problem.

In line with Matta's grand mission to build AI for manufacturing the impossible, the possible must first be manufactured well (and this can still be done with AI). We worked with the Institute for Manufacturing in the Department of Engineering at the University of Cambridge to explore how AI can help detect and correct errors in the 3D printing process, hopefully demonstrating the potential that machine learning can bring to traditional manufacturing and control problems which remain unsolved.

In a paper published in Nature Communications, we introduce an easily deployable method using inexpensive webcams and a single multi-head deep convolutional neural network to augment any extrusion-based 3D printer with error detection, correction, and parameter discovery for new materials. This has been realised in this work through the development of CAXTON: the collaborative autonomous extrusion network, which connects and controls learning 3D printers, allowing fleet data collection and collaborative end-to-end learning. Each printer in the network can continuously print and collect data, aided by a part removal system. Unlike existing deep learning AM monitoring work, which often uses the human labelling of errors to train algorithms, CAXTON automatically labels errors in terms of deviation from optimal printing parameters. Uniquely, CAXTON thus knows not just how to identify but also how to correct diverse errors because, for each image, it knows how far printing parameters are from their optimal values. This autonomous generation of training data enables the creation of larger and more diverse datasets resulting in better accuracies and generalisability. The final system is able to detect and correct multiple parameters simultaneously and in real-time. The multi-head neural network can self-learn the interplay between manufacturing parameters due to the single shared feature extraction backbone, even making the system capable of recognising multiple solutions to solve the same error.


Dataset generation process

We generated a new 3D printing dataset containing 1.2 million images of parts printed using polylactic acid (PLA), labelled with their associated printing parameters, for a wide range of geometries and colours using FFF 3D printers. Our data generation pipeline automated the entire process from STL file selection to toolpath planning, data collection and storage. Model geometries were automatically downloaded from the online repository, Thingiverse. The geometries were subsequently sliced with randomly sampled settings. During printing, images were captured every 0.4 seconds. Each captured image was timestamped and labelled with the current printing parameters: actual and target temperatures for the hotend and bed, flow rate, lateral speed, and Z offset. After 150 images had been collected, a new combination of printing parameters was generated for every printer by sampling uniform distributions of each parameter. The new parameter combinations were sent to each printer over the network as G-code commands which were subsequently executed. Upon execution, another 150 labelled images were gathered before the parameter update process happened again. This automated labelling procedure for each image provided greater resolution than human-based labelling because no human operator could label parameters with the same level of accuracy, and no human could label an image with an exact combination of multiple printing parameters as they strongly interact with each other.
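
The sampling and labelling loop described above can be sketched roughly as follows. This is a minimal simulation, not the pipeline used in the paper: the parameter ranges in `PARAM_RANGES` and the function names are hypothetical, and only the 150-image batch size and the 0.4 second capture interval come from the text.

```python
import random

# Hypothetical sampling ranges; the post does not state the actual limits.
PARAM_RANGES = {
    "flow_rate": (20.0, 200.0),       # percent (assumed)
    "lateral_speed": (20.0, 200.0),   # percent (assumed)
    "z_offset": (-0.08, 0.32),        # mm (assumed)
    "hotend_target": (180.0, 230.0),  # degrees C (assumed)
}
IMAGES_PER_COMBINATION = 150  # from the text
CAPTURE_INTERVAL_S = 0.4      # from the text


def sample_parameters(rng):
    """Draw one parameter combination, each value from its own uniform range."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}


def label_images(n_images, rng):
    """Return one label record per simulated image capture."""
    records = []
    params = sample_parameters(rng)
    for i in range(n_images):
        # After every 150 images a fresh combination is drawn; in a real
        # setup this is where it would be sent to the printer as G-code.
        if i > 0 and i % IMAGES_PER_COMBINATION == 0:
            params = sample_parameters(rng)
        records.append(
            {"index": i, "timestamp_s": round(i * CAPTURE_INTERVAL_S, 1), **params}
        )
    return records


records = label_images(450, random.Random(0))
```

Every image in a batch of 150 carries the same label, so the labels cost nothing to produce and record the exact parameter combination in force when the image was taken.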

Model architecture, training and performance

The accurate prediction of current printing parameters in the extrusion process from an input image is achieved using a multi-head deep residual attention network with a single backbone and four output heads, one for each parameter. In deep learning, single-label classification is very common and requires only a single output head to classify the input as one of N possible classes. However, this work requires multi-label classification to classify the input as one of three possible classes (low, good, and high) for each of the four labels (flow rate, lateral speed, Z offset and hotend temperature). To achieve this, multiple output heads are used with a shared backbone for feature extraction. The weights of the shared backbone are updated during the backward pass in training by a sum of the losses from each of the separate output heads. This allows the backbone to learn its own interpretation of the relationships between each of the parameters and the importance of certain features shared across the parameters.
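
To make the summed loss concrete, here is a small framework-free sketch for a single image. It assumes a softmax cross-entropy loss per head, which the post does not specify, and the logits are made-up numbers; in practice a deep learning framework would compute the total and backpropagate it through the shared backbone.

```python
import math

HEADS = ("flow_rate", "lateral_speed", "z_offset", "hotend_temperature")
CLASSES = ("low", "good", "high")


def cross_entropy(logits, target_index):
    """Softmax cross-entropy for one head, using the log-sum-exp trick."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target_index]


def multi_head_loss(head_logits, targets):
    """Sum the per-head losses; this total is what would be backpropagated."""
    return sum(
        cross_entropy(head_logits[h], CLASSES.index(targets[h])) for h in HEADS
    )


# Made-up logits for a single image: three values per head.
logits = {
    "flow_rate": [2.0, 0.1, -1.0],
    "lateral_speed": [0.0, 0.0, 0.0],
    "z_offset": [-0.5, 1.5, 0.2],
    "hotend_temperature": [0.3, 0.2, 2.2],
}
targets = {
    "flow_rate": "low",
    "lateral_speed": "good",
    "z_offset": "good",
    "hotend_temperature": "high",
}
total = multi_head_loss(logits, targets)
```

Because the four terms are added before the backward pass, the gradient reaching the backbone is the sum of the gradients from every head, so features useful to several parameters are reinforced together.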

Fig. 1: Multi-head residual attention network architecture, performance, and visualisations for human interpretation. (A) The multi-head network architecture consists of a single shared Attention-56 network backbone, followed by four separate fully connected output heads after the flattening layer, one for each parameter. (B) Example attention masks at each module

The use of attention in the network may reduce the number of network parameters needed to achieve the same performance for our application, whilst making the network more robust to noisy labels. The attention maps may also aid in inspecting errors and explaining predictions. The single backbone allows feature extraction to be shared across all parameters and, as such, reduces inference time compared to having separate networks. Additionally, it allows the single network to model the interplay between different parameters. Each head has three output neurons for classifying a parameter as low, good, or high. With this structure, the network predicts the state of the flow rate, lateral speed, Z offset, and hotend temperature simultaneously in one forward pass from a single RGB input image.

To visualise which features the network is focussing on at each stage, images of the attention maps after each module were created. Here, the same attention mask from each module is applied to each of the 3 input images with the areas not of interest darkened (note: these masks are illustrative examples as each module contains many different attention maps). The network appears to focus on the printed regions in the example mask output for attention module 1, and then only on the most recent extrusion for module 2. Module 3 applies the inverse to the previous, focusing on everything but the nozzle tip.

Fig. 2: Machine vision control system pipeline.

Online correction and parameter discovery pipeline

To test the ability of the network to correct printing errors and discover optimal parameters for new materials, random 3D models were again downloaded, but this time for testing correction. During the printing process, images of the nozzle tip and material deposition were taken and automatically cropped to a square region focused on the nozzle tip. Next, our deep multi-head neural network produced a prediction (too high, too low, good) for each parameter given this image as input. These predictions were then stored in separate lists of different set lengths for each parameter. If a particular prediction was made frequently enough that it made up a proportion of a full list greater than or equal to a threshold, then a final prediction was found and accepted. If not, then no updates were applied. If a prediction was found to be ‘too high’ or ‘too low’, the proportion of the list length constituted by the mode value was used to scale the adjustment to the parameter, facilitating proportional correction. Once the final update amounts had been calculated for the printing parameters, they were sent to each printer for real-time execution.
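
The thresholding and proportional update for one parameter might look something like the sketch below. The class name, the buffer length, the threshold, the maximum step, and the choice to clear the list after a correction are all illustrative assumptions; the post gives the idea but not these values, and the exact scaling rule used in the paper may differ.

```python
from collections import Counter, deque


class ParameterCorrector:
    """Hypothetical per-parameter buffer with mode thresholding."""

    def __init__(self, buffer_length, threshold, max_step):
        self.buffer = deque(maxlen=buffer_length)
        self.threshold = threshold  # minimum share of the list the mode must hold
        self.max_step = max_step    # largest adjustment for a unanimous list

    def update(self, prediction):
        """Add a prediction ('low', 'good' or 'high'); return the adjustment."""
        self.buffer.append(prediction)
        if len(self.buffer) < self.buffer.maxlen:
            return 0.0  # wait until the list is full
        mode, count = Counter(self.buffer).most_common(1)[0]
        proportion = count / self.buffer.maxlen
        if proportion < self.threshold or mode == "good":
            return 0.0  # no confident error, so no update
        direction = -1.0 if mode == "high" else 1.0
        self.buffer.clear()  # assumed: start afresh after a correction
        # Scale the step by how dominant the mode was (proportional correction).
        return direction * proportion * self.max_step


corrector = ParameterCorrector(buffer_length=10, threshold=0.8, max_step=20.0)
adjustments = [corrector.update(p) for p in ["high"] * 9 + ["good"]]
```

In this run the first nine calls return no adjustment because the list is not yet full; on the tenth, 'high' makes up nine tenths of the list, so the returned adjustment is negative and about nine tenths of `max_step`.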

Fig. 3: Learning to print unseen materials (ABS-X)

To demonstrate the system’s correction capability, an experimentation pipeline was constructed to take an input STL file, slice it with good print settings, insert a G-code command to alter a parameter to a poor value and then parse the generated G-code and split the model into 1 mm sections. The same printer model was used as in training but with an altered camera position, a new 0.4 mm nozzle with different external geometry, and an unseen single-layer printing sample. These single-layer prints are used as a clearly interpretable benchmark to test each of the individual parameters and combinations of parameters across different printers, setups, and materials. The flow rate, Z offset, and hotend temperature parameter defects are clearly visible, while the lateral speed defect can be observed as a darker line where print speed was slowed. Despite being trained only on extruded thermoplastic PLA, the control pipeline generalises to diverse materials, colours, and setups. We showed online correction for four different thermoplastics printed with different combinations of multiple random incorrect printing parameters on similar interpretable single-layer benchmarks. In each case, the network successfully updated multiple parameters, resulting in good extrusion.
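
One plausible reading of the 1 mm splitting step is that long linear moves are broken into sub-moves of at most 1 mm, so that a parameter update can take effect between short moves. The sketch below illustrates that reading only; the function name is hypothetical, and it assumes absolute XY coordinates with relative extrusion (E values per segment), which the post does not state.

```python
import math


def split_move(x0, y0, x1, y1, extrusion, max_length=1.0):
    """Split one linear move into equal G1 sub-moves no longer than max_length (mm)."""
    length = math.hypot(x1 - x0, y1 - y0)
    n = max(1, math.ceil(length / max_length))
    lines = []
    for i in range(1, n + 1):
        t = i / n
        x = x0 + t * (x1 - x0)
        y = y0 + t * (y1 - y0)
        # The extruded amount is shared evenly across the sub-moves.
        lines.append(f"G1 X{x:.3f} Y{y:.3f} E{extrusion / n:.5f}")
    return lines


# A 3.5 mm move becomes four sub-moves of 0.875 mm each.
segments = split_move(0.0, 0.0, 3.5, 0.0, extrusion=0.35)
```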

Fig. 4: Printer and feedstock agnostic online parameter correction and discovery.

We applied the control pipeline using the same printer model as used in training on an unseen chess rook geometry to demonstrate that our methodology could be used in a production setting for full 3D geometries. Multiple random incorrect printing parameters were introduced halfway through printing, specifically a very high flow rate, lateral speed and hotend temperature and a low Z offset. The rook printed without correction dramatically failed, whereas the rook printed with the same conditions with correction enabled was completed successfully. We also printed six copies of the same 3D spanner geometry, each starting with the same combination of incorrect printing parameters: low flow rate and lateral speed, high Z offset and good hotend temperature. Of the six spanners, three were printed without correction resulting in one complete failure due to detachment from the print bed and a very poor surface finish on the remaining two. These errors are due to the poor initial layer caused by the suboptimal printing parameters. The three printed with correction were all completed successfully and exhibit the same improved surface finish, particularly on the initial layer. It should be noted that these corrected prints do not match a perfectly printed part. Imperfections are present until all the necessary corrections have been applied, and as such, some of the initial layer is printed with poor starting parameters.

Fig. 5: Correcting an artificially-generated flow rate error in real-time

To demonstrate the system’s generality, a different camera and lens were attached to a new location on an unseen printer (Lulzbot Taz 6) with a differently shaped nozzle and a different nozzle width (0.6 mm instead of the 0.4 mm used in training). This printer uses an extrusion system which takes 2.85 mm diameter filament as input rather than the 1.75 mm used in the training printers. The same control system was applied to an unseen bishop geometry. Random incorrect printing parameters were introduced early on in the print, specifically during layer 7. These incorrect parameters were a low lateral speed and high flow rate, Z offset and hotend temperature. The erroneous bishop printed without correction failed, whereas the bishop printed under exactly the same conditions with the control pipeline enabled was completed successfully with greater detail. Single-layer benchmark prints were completed with each individual erroneous parameter introduced using white PLA. These demonstrate that the multi-head neural network and control pipeline generalise to correct parameters across fused deposition modelling printers.

The control pipeline was further tested on a direct ink writing setup using a stepper motor with a threaded rod to move a plunger in a syringe. This used a different camera model and lens mounted at a different angle and distance from the nozzle with a transparent and reflective glass print bed instead of the black bed used during the thermoplastic tests. With this setup, PDMS, mayonnaise and ketchup were printed using a variety of nozzles: 0.21 mm for the PDMS and 0.84 mm for the condiments. These tests showed that, for PDMS, the network learned to increase the flow rate by raising the pressure applied to the syringe. Once the required pressure was reached, the network reduced the flow rate to stop over extrusion. However, during long prints, the flow rate sometimes overshot due to a large build-up of pressure in the syringe, especially when the network did not reduce the flow rate fast enough. Balancing this pressure was especially challenging in this specific setup due to the viscous material and small nozzle diameter requiring high pressures for printing, creating a time gap between plunger movement and extrusion. When printing less viscous materials this overshoot and pressure delay were less of a problem, especially with larger nozzle diameters. For the mayonnaise and ketchup examples, the network mostly adjusted the flow rate and Z offset. We found both condiments tended to over extrude, and the network often reduced the flow rate and, for the first layer, lowered the Z offset. When printing multi-layered structures, the network tended to raise the Z offset at each layer and reduce the flow rate to stop the nozzle tip from being submerged in the previous layer.

Gradient-based visual explanations of network predictions

It is helpful to seek possible explanations for why models make certain decisions, particularly when deploying deep neural networks in production for safety-critical applications. Two popular visualisation methods that may help users gain some understanding of why neural networks make their predictions are guided backpropagation and Gradient-weighted Class Activation Mapping (GradCAM). The former helps to show finer resolution features learned by the network in making predictions and the latter provides a coarser localisation showing important regions in the image (this can be thought of as post hoc attention). For both approaches, the target category (low, good, and high) for each of the four parameters is provided to determine which features or regions are specifically important for that category. On top of this, a method was developed to apply the techniques for each parameter separately within the whole network, allowing us to produce up to 12 tailored visualisations for an input image (the three classes for each of the four parameters, e.g. low flow rate, high lateral speed, good Z offset).
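
For readers unfamiliar with GradCAM, the core computation is small: each feature map is weighted by the average gradient of the chosen class score with respect to that map, the weighted maps are summed, and negative values are clipped. The sketch below shows that standard step on tiny made-up arrays. It is not the per-parameter method developed in the paper; in practice the activations and gradients would come from a deep learning framework, with the class score taken from one chosen output head.

```python
def grad_cam(activations, gradients):
    """Standard GradCAM map from K feature maps and their gradients.

    Both arguments are lists of K channel maps, each an H x W nested list.
    """
    n_channels = len(activations)
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weights: global average of each channel's gradient map.
    weights = [sum(sum(row) for row in g) / (height * width) for g in gradients]
    cam = [[0.0] * width for _ in range(height)]
    for k in range(n_channels):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weights[k] * activations[k][i][j]
    # ReLU keeps only regions that push the target class score up.
    return [[max(0.0, v) for v in row] for row in cam]


# Made-up 2 x 2 example with two channels.
activations = [[[1, 0], [0, 1]], [[0, 2], [2, 0]]]
gradients = [[[1, 1], [1, 1]], [[-1, -1], [-1, -1]]]
heatmap = grad_cam(activations, gradients)
```

Here the second channel has a negative weight, so the locations where only it is active are clipped to zero and the heatmap highlights the diagonal where the first channel fires.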

Fig. 6: Visual explanations using separate saliency maps for each parameter may assist in verifying the robustness of the network.

Multiple combinations of erroneous parameters can result in either separated paths of extruded material (under extrusion) or overlapping paths of material (over extrusion). Guided backpropagation was used to try to determine if the network uses similar features across parameters to detect these physical extrusion properties. Representative example images for under, good, and over extrusion caused by different parameters were analysed. It appeared that similar features were shared between parameters for the same extrusion classification: separated paths for under extrusion, an outline of the current path for good extrusion, and around the nozzle for over extrusion.

GradCAM was applied to every layer of the shared network backbone for each of the parameters separately. We showed visualisations from the first and last layers (residual blocks 1 and 6, respectively). Earlier stages in the network appear to detect large structural features in the image, such as differentiating between the deposited material and the print bed. By the last layer, the network predominantly focused on the most recent extrusion from the nozzle irrespective of parameter or target class. This is desirable because, for fast response times and corrections, the network should use information from the most recently deposited material for its prediction. Example visualisations were also presented from direct ink writing tests. These images demonstrated that the trained network can use similar features at each stage during prediction as it uses for thermoplastic predictions.


We demonstrated that training a multi-head neural network using images labelled in terms of deviation from optimal printing parameters enables robust and generalisable real-time extrusion AM error detection and rapid correction. The automation of both data acquisition and labelling allows the generation of an image-based training dataset sufficiently large and diverse to enable error detection and correction that generalises across realistic 2D and 3D geometries, materials, printers, toolpaths, and even extrusion methods. The deep multi-head neural network was able to simultaneously predict with high accuracy the four key printing parameters (flow rate, lateral speed, Z offset and hotend temperature) from images of the nozzle during printing. It was found that this additional context and knowledge of multiple parameters may even lead to an improvement in predicting individual parameters, though further research to support this finding is needed. Like a human, the system was able to creatively propose multiple solutions to an error and could even discover new parameter combinations and learn how to print new materials. Unlike humans, though, the system operated continuously and made corrections instantaneously. Alongside this network, we present numerous advances in the feedback control loop with new additions such as proportional parameter updates, toolpath splitting, and optimised prediction thresholding, which combined provide an order of magnitude improvement in correction speed and response time compared to previous work.


Douglas A. J. Brion

Sebastian W. Pattinson


This work has been funded by the Engineering and Physical Sciences Research Council, UK PhD Studentship EP/N509620/1 to D.A.J.B., Royal Society award RGS/R2/192433 to S.W.P., Academy of Medical Sciences award SBF005/1014 to S.W.P., Engineering and Physical Sciences Research Council award EP/V062123/1 to S.W.P. and an Isaac Newton Trust award to S.W.P.