
Automatic stock analysis using image processing and machine learning

Transcript

1 Automatic stock analysis using image processing and machine learning
Daniel Garten 1, Katharina Anding 2, Gerhard Linß 2, Peter Brückner 2
1) GFE - Society for Manufacturing Technology and Development Schmalkalden e.V., Näherstiller Straße 10, D Schmalkalden, URL:
2) Technical University of Ilmenau, Faculty of Mechanical Engineering, Quality Assurance Department, Gustav-Kirchhoff-Platz 2, D Ilmenau, URL:
The project on which this lecture is based was funded by the Federal Ministry of Economics and Technology under the funding code 16INO496. The author is responsible for the content of the lecture.

2 Outline
1 Introduction
2 Image acquisition
3 Image features and classifier
4 Results achieved in practical use
5 Summary

3 1 Introduction: Motivation
- Wheat as a bread grain is a staple food in many countries; the worldwide harvest was 690 million tons in 2008 (source: FAOSTAT 1).
- Some components of a wheat load are harmful to health (e.g. toxic foreign seeds, grains damaged by Fusarium, ergot) or can damage processing machines (e.g. stones, metal parts).
- The composition of a wheat load determines acceptance or rejection by the customer (mills and storage facilities), as well as price and intended use.
- The components of a wheat load are determined by manually sorting a random sample of the load (stock analysis, i.e. determination of impurities or "Besatz", according to EG-VO 824/2000 or ICC No. 102/1) => cost-intensive, subjective and error-prone.
- Aim: automation of the stock analysis based on a representative sample of approx. 300 g of wheat.
1) Food and Agriculture Organization of the United Nations, Statistics Division

4 1 Introduction: Comparison of manual and automatic analysis
- Sample size: manual stock analysis 50 g / 100 g; automated stock analysis 250 g / 500 g
- Duration of analysis: manual 30 min / 45 min; automated 5 min / 10 min 1)
- Assessment: manual subjective; automated objective
1) achieved in the project: 50 g / minute
Advantages through automation:
- Objectification of the stock analysis
- Increase in food safety and the possibility of automated documentation
- Time and cost savings for mills and grain stores
- Practical procedure for a continuous stock analysis of every delivery

5 1 Introduction: Working phases of technical recognition
- Visualization of characteristic features (camera, lighting, object feed)
- Numerical quantification of characteristic features

6 2 Image acquisition: Comparison of two possible image acquisition principles
Image acquisition of the sample components in free fall:
+ very good object separation
+ high throughput (5000 objects in 7 min.)
- object speed not constant
- no dimensionally accurate image
- illumination inhomogeneities / blurring due to the different trajectories of the objects
Image acquisition of the sample components lying on a conveyor belt:
+ true-to-shape image
+ constant object speed
+ more homogeneous lighting situation
- poor object separation
- slightly lower throughput
- complicated separation system required

7 Brief overview of the image features used
Shape features:
- Circularity factor
- Area of the object region
- Height and width of the foreground region (bounding box dimensions)
- Moments of the edge contour
- Convexity
- Fitting of a circle and an ellipse to the edge contour
- Area of the convex hull
Color and texture features:
- First-order statistics of the gray values in the HSI color space
- Quantized histograms of the H, S and I channels
- Fitting of planes to the gray-value surface in the H, S and I channels
- Features from the image filtered with Laws texture filters
- Local binary patterns
- Features from the gray-level co-occurrence matrix
Result: high-dimensional feature vector (D > 200)
Question: Which features are really relevant for the recognition problem?
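
To make a few of the listed shape features concrete, the following is a minimal sketch in plain Python, not the project's implementation. It assumes the edge contour is available as an ordered list of (x, y) vertices, and it assumes common textbook definitions: circularity as 4*pi*area / perimeter^2 and convexity as object area divided by convex hull area. The function names are hypothetical, and the image processing library used in the project may define these features differently.

```python
import math

def polygon_area(points):
    """Area of a closed polygon given as ordered (x, y) vertices (shoelace formula)."""
    n = len(points)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0

def polygon_perimeter(points):
    """Length of the closed contour."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))

def convex_hull(points):
    """Convex hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def shape_features(contour):
    """A few shape features of one object contour (assumed definitions, see lead-in)."""
    area = polygon_area(contour)
    perimeter = polygon_perimeter(contour)
    hull_area = polygon_area(convex_hull(contour))
    xs = [p[0] for p in contour]
    ys = [p[1] for p in contour]
    return {
        "area": area,
        "width": max(xs) - min(xs),    # bounding box width
        "height": max(ys) - min(ys),   # bounding box height
        "circularity": 4.0 * math.pi * area / perimeter ** 2,
        "hull_area": hull_area,
        "convexity": area / hull_area,
    }

# Usage: an L-shaped contour is less convex than a square.
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
l_shape = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
print(shape_features(square))
print(shape_features(l_shape))
```

Each dictionary would contribute a handful of entries to the feature vector; the color and texture features would be appended in the same way.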

8 Example: segmentation of the sprout in the case of sprouting damage
[Figure: grain with sprouting damage (starting region)]
The length of the region obtained as the set difference serves as a feature for classification.

9 Procedure for feature selection
1. Random division of the existing data into 3 partitions (P1, P2, P3)
Percentage split of the data set:
- DS1 (33.33%): feature evaluation
- DS2 (22.22%): training
- DS3 (44.44%): test of recognition performance
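
The random three-way division can be sketched as follows. This is only an illustration under assumptions: the ratio 3:2:4 reproduces the stated 33.33% / 22.22% / 44.44%, the function name `split_three` and the fixed seed are hypothetical, and the split is not stratified by class, which the slides do not specify either way.

```python
import random

def split_three(samples, ratio=(3, 2, 4), seed=0):
    """Shuffle the samples and split them into three partitions with the given ratio."""
    rng = random.Random(seed)      # fixed seed only to make the sketch reproducible
    shuffled = list(samples)
    rng.shuffle(shuffled)
    total = sum(ratio)
    n = len(shuffled)
    cut1 = n * ratio[0] // total
    cut2 = cut1 + n * ratio[1] // total
    return shuffled[:cut1], shuffled[cut1:cut2], shuffled[cut2:]

# Usage with 900 hypothetical object indices.
p1, p2, p3 = split_three(range(900))
print(len(p1), len(p2), len(p3))
```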

10 Procedure for feature selection
1. Random division of the existing data into 3 partitions (P1, P2, P3)
2. Feature evaluation with a filter method, e.g. information gain, on P1
3. Sorting of the features in ascending order of their calculated filter score
4. Iteration of the following sub-steps:
- Removal of the m worst-rated features
- n-fold cross-validation with the classifier to be used on P2
- Abort if the recognition rate has decreased compared to the last iteration step
- Further iteration if the recognition rate has not decreased compared to the last iteration step (optional significance test, e.g. according to McNemar)
5. Final estimation of the recognition performance by training the classifier with the data from P1 and P2, followed by a test on the data from P3
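
Steps 2 to 4 can be sketched in plain Python as shown below. This is a simplified illustration, not the project's code: information gain is computed for already discretized feature values, the feature names and data are made up, and `toy_cv_score` is a hypothetical stand-in for the n-fold cross-validation of the real classifier on P2. The optional McNemar test is omitted.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """Information gain of one discrete feature with respect to the class labels."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def backward_elimination(ranked_features, cv_score, m=1):
    """Remove the m worst-rated features per step until the score decreases.

    ranked_features: feature names sorted ascending by filter score (worst first).
    cv_score: callable returning the cross-validated recognition rate of a subset.
    """
    current = list(ranked_features)
    best = cv_score(current)
    while len(current) > m:
        candidate = current[m:]
        score = cv_score(candidate)
        if score < best:           # recognition rate decreased: abort
            break
        current, best = candidate, score
    return current, best

# Made-up, discretized example data (P1).
labels = ["wheat", "wheat", "weed", "weed"]
features = {
    "noise": [0, 1, 0, 1],      # unrelated to the class
    "hue_bin": [0, 0, 1, 1],    # separates the classes perfectly
    "size_bin": [0, 0, 0, 1],   # partly informative
}
scores = {name: information_gain(vals, labels) for name, vals in features.items()}
ranked = sorted(scores, key=scores.get)   # ascending: worst feature first

def toy_cv_score(subset):
    """Hypothetical stand-in for cross-validation on P2."""
    return (0.9 if "hue_bin" in subset else 0.5) - 0.05 * (len(subset) - 1)

selected, rate = backward_elimination(ranked, toy_cv_score, m=1)
print(ranked, selected, rate)
```

The loop keeps iterating when the score is unchanged, which matches the "has not decreased" condition in step 4.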

11 Influence of the number of features on the recognition performance
[Figure: recognition rate in % as a function of the number of features for the SVM]

12 Supervised learning methods for technical recognition
Artificial neural network:
- Network structure (number of neurons in the input layer, number of hidden layers and the neurons they contain)
- Parameters (learning rate, activation function)
Network adaptation is an optimization problem with a large number of variables.
Support vector machine:
- Kernel function with its parameters
- Complexity parameter

13 Parameter optimization of the SVM used
In theory, the support vector machine (SVM) is well suited to the present recognition problem (structural risk minimization, suitable for high-dimensional spaces with a complex cluster distribution).
Problem: choice of parameters (complexity parameter Nu, kernel parameters).
Investigation of their influence by means of statistical design of experiments.

14 Parameter optimization of the SVM used
Findings:
1. Strong influence of Gamma
2. Influence of Nu smaller than that of Gamma
3. If the value of Gamma is too high, the model becomes too complex and the generalization ability decreases (overfitting)
=> Simplified optimization strategy: start with small values of Nu and Gamma and increase them gradually until the recognition performance decreases (using 10-fold cross-validation).
[Figure: dependency of the recognition performance on the parameters Nu and Gamma]
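
The simplified optimization strategy might look like the sketch below. Several details are assumptions that the slides do not state: the start values, the doubling step, the upper bounds, and the order (Gamma first, because it is reported to have the stronger influence, then Nu). `toy_cv_score` is a hypothetical surrogate; in practice this callable would run the 10-fold cross-validation of the SVM with the given Nu and Gamma.

```python
import math

def increase_until_worse(score, start, factor, upper):
    """Increase one parameter geometrically while the score does not decrease."""
    best_value, best_score = start, score(start)
    value = start * factor
    while value <= upper:
        s = score(value)
        if s < best_score:         # recognition performance decreased: stop
            break
        best_value, best_score = value, s
        value *= factor
    return best_value, best_score

def tune_nu_gamma(cv_score, nu_start=0.01, gamma_start=1e-4,
                  factor=2.0, nu_max=0.5, gamma_max=10.0):
    """Tune Gamma first, then Nu (assumed order), starting from small values."""
    gamma, _ = increase_until_worse(
        lambda g: cv_score(nu_start, g), gamma_start, factor, gamma_max)
    nu, best = increase_until_worse(
        lambda n: cv_score(n, gamma), nu_start, factor, nu_max)
    return nu, gamma, best

def toy_cv_score(nu, gamma):
    """Hypothetical surrogate with a single optimum, sensitive mainly to gamma."""
    return (0.95
            - 0.05 * abs(math.log2(gamma / 0.0128))
            - 0.01 * abs(math.log2(nu / 0.08)))

nu, gamma, rate = tune_nu_gamma(toy_cv_score)
print(nu, gamma, rate)
```

Such a greedy search evaluates far fewer parameter pairs than a full grid, at the cost of possibly stopping at a local optimum.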

15 Problem of image capture errors I
Types of imaging errors that have occurred:
- Blurring
- Object deformations due to rotation / tumbling motion in free fall
- Objects cut through due to line loss
- Over- or underexposure
- Touching / overlapping of individual objects in the camera image
Rating: rarely occurs, less problematic / very problematic
[Figure: example images labeled Wheat, Wheat, Wheat, Rape; assigned class: weed seeds, weed seeds]
Misclassifications falsify the analysis result of the class weed seeds.

16 Problem of image capture errors II
Problem: images with touching objects are created during image acquisition.
- The support vector machine does not offer an explicit rejection class.
- Pre-filters at the feature vector level (e.g. one-class SVM) are difficult to adapt because of the high intra-class variance paired with low inter-class variance.
- Algorithmic separation of the regions in the image yields high false-positive rates.
=> Search for a suitable solution when using the SVM as a classifier

17 Problem of image capture errors III: Approach to a solution in combination with the SVM
[Figure: segmentation of an image with three touching objects]
Train touching objects as a separate class of the SVM.
Conditions:
- Image errors occur only rarely (here up to a maximum of 0.2%)
- The characteristics of the image acquisition error allow a classification with sufficient accuracy and a low false-positive rate
[Diagram: connected image region -> SVM -> exclude object from further analysis]
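
The effect of such a rejection class on the analysis result can be illustrated with a short sketch. It assumes the classifier output is a list of class labels, one per connected image region; the label names and the function `sample_composition` are hypothetical.

```python
from collections import Counter

REJECT_CLASS = "touching_objects"   # hypothetical name of the learned error class

def sample_composition(predicted_labels, reject_class=REJECT_CLASS):
    """Exclude rejected regions, then count objects and compute shares per class."""
    kept = [label for label in predicted_labels if label != reject_class]
    counts = Counter(kept)
    total = len(kept)
    shares = {cls: 100.0 * n / total for cls, n in counts.items()} if total else {}
    rejected = len(predicted_labels) - total
    return counts, shares, rejected

# Usage with made-up predictions for ten image regions.
predictions = ["wheat"] * 7 + ["weed_seed"] * 2 + ["touching_objects"]
counts, shares, rejected = sample_composition(predictions)
print(counts, shares, rejected)
```

Because rejected regions are dropped before counting, a pair of touching wheat grains no longer inflates another class such as weed seeds.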

18 The adaptive analysis program Grain Analyzer
Functionality:
- Analysis of 300 g of wheat (approx. individual objects) in approx. 6 min.
- Feature evaluation using filter methods
Representation of the analysis results:
- Exact number of objects in the classes to be distinguished
- Sample composition (bar chart)
- Images of all objects held in memory
- Numerical statistics of the analyzed sample
- Average surface area of the objects
- Average longest extent of the objects

19 The adaptive analysis program Grain Analyzer
[Screenshots: live image of the individual objects, training dialog, analysis result]

20 4 Results achieved
Validation on the developed demonstrator by means of pre-sorted sample material that was not used for training, under the influence of:
- Fluctuations in lighting
- Moisture of the wheat
- Wheat variety
- Growth conditions
Estimation of the error for the individual object classes
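
One simple way to estimate the error per object class from pre-sorted material is the fraction of misclassified objects within each true class. The sketch below assumes this definition, which the slides do not spell out; the labels are made up and the function name is hypothetical.

```python
from collections import defaultdict

def per_class_error(true_labels, predicted_labels):
    """Fraction of misclassified objects for each true class."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for t, p in zip(true_labels, predicted_labels):
        totals[t] += 1
        if t != p:
            errors[t] += 1
    return {cls: errors[cls] / totals[cls] for cls in totals}

# Usage with made-up labels for six pre-sorted objects.
true = ["wheat", "wheat", "wheat", "wheat", "ergot", "ergot"]
pred = ["wheat", "wheat", "wheat", "ergot", "ergot", "wheat"]
class_error = per_class_error(true, pred)
print(class_error)
```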

21 5 Summary
1. Automation of the stock analysis is possible.
2. Separation of good wheat grains from damaged wheat grains is very difficult (strong variability in the phenotypic expression of the individual grains, strong variability in the expression of defects).
3. Localized (point-like) defects are not always recognizable when viewed from one side (a 2-camera solution is extremely cost-intensive).
4. Color is the most important differentiation criterion between the individual object classes.
5. Shape and texture are less relevant for differentiation.
6. All steps of the pattern recognition chain were optimized as part of the QualiKorn project.
7. Recognition rates of up to 95% in practical use
8. Successful two-week system test in two mills

22 Thank you for your attention! Any questions?
GFE - Society for Manufacturing Technology and Development Schmalkalden e.V., Measurement technology / test bench construction division, Näherstiller Straße 10, D Schmalkalden, Tel.: 03683 / URL:
Technical University of Ilmenau, Faculty of Mechanical Engineering, Department of Quality Assurance, Gustav-Kirchhoff-Platz 2, D Ilmenau, Tel.: 03677 / URL:

23 Literature
- EG-VO No. 824/2000: Ordinance on the procedure and the conditions for the acceptance of grain by the intervention agencies as well as the analysis methods for the determination of the quality, 2000
- Anding, K.; Garten, D.: Comparison of Different Classification Algorithms at the Application of Automatical Quality Assurance of Grain. In: 53rd International Scientific Colloquium, Technical University of Ilmenau, 2008
- Anding, Katharina; Brückner, Peter; Dambon, Martin; Garten, Daniel: Measuring Wheat Quality. In: Journal for Vision Systems Design, Volume 15, Issue 6, 2010
- Garten, Daniel; Brückner, Peter; Linß, Gerhard: Image Acquisition and Image Features for the Automated Quality Assurance of Grain. In: Artificial Intelligence and Applications (AIA) 2010, Innsbruck, February 2010
- Hall, M. A.; Holmes, G.: Benchmarking attribute selection techniques for discrete class data mining. In: IEEE Transactions on Knowledge and Data Engineering, 15 (6), 2003, pp.
- Linß, Gerhard: Quality management for engineers. Fachbuchverlag Leipzig, 2nd edition, 2005, pp. 417ff