Reconstruction of Charts from Images

1. Introduction

The universal framework for science and engineering is devoted to the development of Grandiose Projects. So it is not easy to catch it. Here I provide a simple typical sample of framework application. I hope that this sample will provide a better understanding of the Framework. Let us describe the sample. Suppose that we have a bitmap image of chart (See picture above). We would like to obtain numeric information from this chart. Moreover we would like to obtain math dependency from this chart. The bottom part of the above image contains digital information and approximation of it by polynomial.

2. Background

Reconstruction of charts from image is briefly considered in my article Digital Image Processing. However my friends asked me for more details and particular cases. Reconstruction with marking is considered in this article. Suppose we have image with the following chart:

The following text explains step by step reconstruction of the image.

2.1 Filtering

We can construct digital filter for obtaining its points. This filter would require a lot of intellectual work. It would be easy to mark the image with colored points as it is presented below: 

Origin of coordinates is marked by red color, top point is marked by green color, right point is marked by blue color. Points of chart are marked by aqua color. Then we use digital filters for definition of points. For example, filter of chart point is presented by the following expression:

Where r, g and b are intensities of red, green and blue color respectively. Parameters x and y are threshold values for filter. After filtration, we have chart points that are marked by black points.

Filtration algorithm is presented in the following picture:

The Source object is source image. The Points Filter object contains necessary math operations. The Points Filter object represents the result of filtration. The filtration result is presented below: 


The Points Selection object transforms filtered image into digital selections. The first selection is a selection of X - coordinates of black points, the second one is a selection of Y - coordinates. So we already have digital representation of points of chart: 


However the image should be scaled. Moreover the image can have skewing. The calibration is needed.

2.2 Calibration

Shift, scale and skew of image can be compensated by following expression:

where xphysical is physical 2D vector which corresponds to physical coordinates x and y. ximage coordinates of this point in image. Vectors a, b define shift of image and matrix M defines scaling and skewing. Definition of these parameters contains definition of coordinates of image points and math processing. These operations are considered below.

2.2.1 Definition of coordinates of points on image

Definition of coordinates of base points is presented by the following sample:

The Source object is the source image. The Zero Filter object contains filtration expressions of the image. The result of filtration is the Zero Image object. Result of filtration is the image with a single black pixel. We would like to define its coordinates. First of all, we transform this image to selections using Zero Selection component. This component transforms image into two selections. The first selection is the selection of X coordinates of black pixels. The second selection is the selection of Y coordinates of black pixels. The Zero Processor object performs definition of coordinates. Properties of this object are presented below: 


This component defines parameters a and b of Zero Regression. It compares Formula_1 and Formula_2 of Zero Regression object with X and Y selections of Zero Selection object. In result parameters a and b of Zero Regression are estimates of X and Y coordinates of pixel which corresponds to origin of coordinates. Similarly top and right point is being defined.

2.2.2 Math processing

The full picture of math processing is presented in the following picture:


Let us describe its components. The Scale object enables us to manually define physical values of border values of coordinates. Properties of this object are presented below:


where a=xmin, b=xmax, c=ymin, d=ymax. This data set and data obtained in 2.2.1 is quite enough for definition of vectors a, b define shift of image and matrix M. A set of squares contains necessary formulas. Using the above transformation, we can define physical values of chart point. The following picture contains chart and digital representation of these points:


2.3 Regression

So we have physical values of points. We would like to approximate them. We use regression for this purpose. I already wrote an article about regression. Here I will consider approximation of points by polynomial. The Regression 3 component is used for regression formula. Properties of this component are presented below:


Formula of this component is formula of polynomial of degree 3. The x is variable of this polynomial. Otherwise x is Formula_1 of ProcessedX project. Formula_1 calculates array of calibrated X coordinates of chart. Output of this formula is componentwise calculation of this polynomial. This component is used by Processor 3 component. Properties of this component are presented below:


These properties have the following meaning. Left pane means that parameters a, b, c, d of Regression 3 component should be defined. These parameters are polynomial coefficients. The 0 of middle and 0 right pane mean that Formula_1 of Regression 3 object should match to Y - selection of Scaled Points object. The (-1) means neglecting of formula or selection in regression component. So Processor 3 approximates ordinates of selection by polynomial. Approximation result is presented below:


Green curve is the result of interpolation by polynomial of degree 3. Red crosses are points of chart.

3. Automation Equipped Working Place for Image Reconstruction

The framework is too complicated. If the user would only like to reconstruct charts, then he does not need to learn complicated framework. So special application for such users is developed. The framework is used as computational resource. A screenshot of the application is presented below:

User marks points of image and then user obtains digital information and approximation. Since application information is obtained by mouse events, we do not need digital image processing. Our algorithm uses coordinates of clicks. The new algorithm is presented in the following picture:

This algorithm is very close to the algorithm described in part 2. But Points Selection is described in part 2 algorithm is selection obtained from image. In this Points Selection is apriori known selection.

This algorithm is saved to file and is used as an application resource:

Resource parameters are obtained from user interface. User mouse clicks the following form:

In result pixels of chart become aqua. Similarly if we change selection of combobox...

...then we can mark Top, Right and Origin point of chart image. Physical limits of X and Y user are entered to DataGridView.

Then this data set is used in the following way:

 // Virtual desktop
 desktop = new PureDesktopPeer();

 // Load desktop from resource

 // The "Points Selection" object. This object is series
 DataPerformer.Series s = desktop.GetObject("Points Selection") as DataPerformer.Series;

 // Clears series

 // Fills series by points obtained from points obtained from mouse clicks
 foreach (int[] x in marks)
    s.AddXY((double)x[0], (double)x[1]);

 // The "Scale" object. This object implements "IAlias" interface
 IAlias sc = desktop.GetObject("Scale") as IAlias;

 // Filling parameters of this object by values from DataGridView
 sc["a"] = scale[0, 0];
 sc["b"] = scale[0, 1];
 sc["c"] = scale[1, 0];
 sc["d"] = scale[1, 1];

 // Objects which contains information about Origin, Top and Right point of chart image
 string[] ss = new string[] { "Zero Regression", "Top Regression", "Right Regression" };
 string[] ab = new string[] { "a", "b" };
 for (int i = 0; i < ss.Length; i++)
    IAlias alias = desktop.GetObject(ss[i]) as IAlias;
    for (int j = 0; j < ab.Length; j++)
       // Filling values by values obtained by mouse click
       alias[ab[j]] = (double)coord[i, j];

So stored in resources virtual desktop used as computational resource. We have access to objects of virtual desktop and their properties. This application enables us approximate charts by polynomials. Approximation chart and cooeficients of polynomial are presented below:


Points of Interest

The way of understanding of framework is not easy. However the shortest way is not the best way. Difficulties should not stop my attempts. Moses used 40 years for training of his people. If explanation of my ideas requires even 40 years, then I should not stop my work.


Recently my friend asked me about chart reconstruction. I've answered that this problem is contained in my article devoted to digital image processing. He said that this article is very complicated. I find that chart reconstruction is a very popular theme and a good promotion of my ideas.


If this article would be popular, I will improve my soft devoted to reconstruction of charts. The following webpage is devoted to chart reconstruction. These pages will contain different methods of filtration and regression. Polynomial approximation is the simplest method of regression.