
Deep learning based ghost imaging


  • Computer science
  • Imaging and Perception


In this manuscript, we propose a novel framework for computational ghost imaging, namely ghost imaging using deep learning (GIDL). With a set of images reconstructed using traditional GI and their corresponding ground-truth counterparts, a deep neural network is trained to learn the sensing model and improve the image reconstruction. Detailed comparisons between images reconstructed using deep learning and compressive sensing show that the proposed GIDL performs much better at an extremely low sampling rate. Numerical simulations and optical experiments were carried out to demonstrate the proposed GIDL.


Ghost imaging (GI) was first demonstrated as a manifestation of quantum entanglement 1, using a biphoton source. Soon afterwards it was shown that entanglement is unnecessary 2. Despite the debate over the underlying physics, GI was further demonstrated using pseudothermal light generated by dynamically modulating the illuminating laser beam with a spatial light modulator (SLM) 3. Although the source changes, in most cases the final image is reconstructed from the correlation of the signals from the image arm and the reference arm. The reference arm need not be physically present, because its function can be calculated from the knowledge of the random phase patterns displayed on the SLM; this technique is therefore called computational ghost imaging (CGI) 3. CGI has been used for lensless imaging 4, X-ray imaging 5, 6, imaging in poor lighting conditions 7 and in harsh environments 8. However, the need for a large number of measurements is one of the main problems that hold GI back from practical applications 9, 10, 11, 12. Much effort has been made to reduce the sampling rate; both non-computational 13, 14, 15 and computational 9, 10, 11, 16, 17, 18, 19, 20 methods have been proposed to improve image quality at low sampling rates. In particular, compressive-sensing GI (CSGI) 10, 16, 17, 18, 19 and iterative GI 11, 20 model image reconstruction in GI as an optimization problem.

In this letter, we propose a new CGI framework for high-quality image reconstruction under low sampling conditions. The proposed method uses deep learning (DL) and is therefore termed ghost imaging using deep learning (GIDL). DL is a machine learning technique for data modeling and decision making with a neural network that is trained on a large amount of data 21, 22. The application of machine learning techniques in optical imaging was first proposed by Horisaki et al. 23, who used a support vector regression (SVR) architecture to learn imaging through scattering media. The past two years have seen the rapid development of deep learning for solving various inverse problems in optical imaging; for example, it has been used in fluorescence lifetime imaging 24, phase imaging 25, 26 and imaging through scattering media 27, 28. By combining GI and DL, we show in this manuscript that GIDL can reduce the number of measurements as significantly as CSGI, but with much better reconstruction quality. Detailed comparisons between the performance of CSGI and GIDL, including image quality and noise immunity, are also discussed. Our analysis suggests that GIDL promises great potential in applications such as imaging and sensing in harsh environments.

Numerical simulation

In ghost imaging, the unknown object T(x) is illuminated by a sequence of speckle patterns I_m(x), where the subscript m = 1, …, M denotes the m-th illumination. For the m-th speckle pattern, the signal collected by a bucket detector can be written as \(S_m = \int I_m(x)\,T(x)\,\mathrm{d}x\). Traditionally, the image reconstructed using GI is obtained by correlating the signal fluctuation δS_m with the speckle pattern fluctuation δI_m(x):

$$O(x)=\langle \delta S_{m}\,\delta I_{m}(x)\rangle .\qquad (1)$$


In CGI, the speckle intensities I_m(x) are calculated numerically from the phase patterns displayed on the SLM.
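The correlation reconstruction of Eq. (1) is straightforward to express in code. The following is a minimal NumPy sketch; the test object, the number of patterns and the uniform-random pattern statistics are illustrative choices for demonstration, not those used in the paper:

```python
import numpy as np

def gi_reconstruct(patterns, signals):
    """GI reconstruction O(x) = <dS_m * dI_m(x)>, i.e. Eq. (1)."""
    dI = patterns - patterns.mean(axis=0)   # speckle fluctuations dI_m(x)
    dS = signals - signals.mean()           # bucket-signal fluctuations dS_m
    # ensemble average of dS_m * dI_m(x) over the M illuminations
    return np.tensordot(dS, dI, axes=1) / len(signals)

# Illustrative simulation: a 32x32 binary object under random illumination patterns
rng = np.random.default_rng(0)
T = np.zeros((32, 32))
T[8:24, 8:24] = 1.0                          # simple square "object"
I = rng.random((4096, 32, 32))               # M = 4096 patterns
S = np.einsum("mxy,xy->m", I, T)             # bucket signals S_m = sum_x I_m(x) T(x)
O = gi_reconstruct(I, S)                     # noisy estimate of T (up to scale)
```

Here β = M/N_spec = 4096/1024 = 4; lowering M in this sketch reproduces the degradation with decreasing β discussed below.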

It has been shown that the signal-to-noise ratio (SNR) of the image reconstructed in this way is proportional to the measurement ratio, i.e. the ratio between the number of illumination patterns M and the (average) number of speckles in each pattern N_spec 9, 11, namely β = M/N_spec. To show how this works, let us take the images shown in Fig. 1(a) (digits '0', '3', '5' and '6') as examples for our simulation study. These ground-truth images have 32 × 32 pixels. Using Eq. (1), the images can be reconstructed as shown in Fig. 1(b); the columns correspond to the sampling ratios β = 1, 0.4 and 0.1. The results clearly show that the reconstructed images deteriorate significantly as β decreases from 1 to 0.1: the digits are clearly seen when β = 1, although there is noise, but they are completely corrupted by noise when β = 0.1.

Simulation results of GI, CSGI and GIDL. ( a ) Top row: ground-truth objects. ( b, c ) Images reconstructed with GI and CSGI for different measurement ratios β. ( d - f ) Images reconstructed with GIDL after 10, 100 and 500 training epochs. Insets: zoomed images of the digit '6' reconstructed with CSGI and GIDL.


In order to increase the image quality, \(\beta \gg 1\) is usually required in the conventional GI and CGI frameworks, so that the image acquisition process is very time-consuming. A popular solution for shortening the acquisition time is the combination of GI with compressive sensing (CS) theory 10, 16, 18. CS theory allows the object to be accurately recovered from a smaller number of measurements when it is sparse in some representation domain 29. Several CSGI frameworks have been demonstrated so far. However, high-quality image reconstruction when β is small, i.e. \(M \ll N_{spec}\), is still a challenging problem 10, 16, 18. In CSGI one actually aims to solve the following inverse problem rather than Eq. (1):

$$\mathop{\min}\limits_{T}\ \|\nabla T\|_{1}+\frac{u}{2}\|\mathbf{A}T-S\|_{2}^{2},\qquad (2)$$


where ∇T is the discrete gradient of T, u is a weighting factor between the first and second terms in Eq. (2), and A is the measurement matrix representing the linear model between the image and the detected signal vector \(S = [S_1, S_2, \ldots, S_M]^{\top}\), where the symbol ⊤ denotes transposition. In this study we solve Eq. (2) and reconstruct the images using the open-source CS solver TVAL3 30. The images reconstructed in this way are shown in Fig. 1(c). Owing to the sparsity constraint, the measurement ratio β can be reduced to 0.1 while still giving a reasonable reconstruction of the object image in our simulation. With a measurement ratio β = 0.4, the object can be recovered almost precisely. However, the image reconstructed at a measurement ratio of 0.1 is not smooth, due to the sparsity regularization. This problem occurs in CS whenever the number of measurements is small 29.
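TVAL3 itself is a MATLAB package; as a rough illustration of the optimization in Eq. (2), the following sketch minimizes a smoothed total-variation objective by plain gradient descent. The solver, step size, weighting u and toy measurement setup are illustrative assumptions, not the TVAL3 algorithm:

```python
import numpy as np

def csgi_tv(A, S, shape, u=10.0, lr=0.05, iters=400, eps=1e-8):
    """Gradient descent on min_T ||grad T||_1 + (u/2)||A t - S||_2^2,
    with |.| smoothed as sqrt(.^2 + eps) so the objective is differentiable."""
    t = np.zeros(A.shape[1])
    for _ in range(iters):
        T = t.reshape(shape)
        gx = np.diff(T, axis=0, append=T[-1:, :])   # forward differences
        gy = np.diff(T, axis=1, append=T[:, -1:])
        mag = np.sqrt(gx ** 2 + gy ** 2 + eps)
        # gradient of the smoothed TV term is -div(grad T / |grad T|)
        div = (np.diff(gx / mag, axis=0, prepend=0) +
               np.diff(gy / mag, axis=1, prepend=0))
        grad = -div.ravel() + u * A.T @ (A @ t - S)  # data-fit term of Eq. (2)
        t -= lr * grad
    return t.reshape(shape)

# Illustrative toy problem: recover an 8x8 block from M = 40 random measurements
rng = np.random.default_rng(1)
T_true = np.zeros((8, 8))
T_true[2:6, 2:6] = 1.0
A = rng.standard_normal((40, 64)) / np.sqrt(64)      # normalized measurement matrix
S = A @ T_true.ravel()
T_hat = csgi_tv(A, S, (8, 8))
```

Dedicated solvers such as TVAL3 converge far faster and more reliably than this plain descent; the sketch only shows the structure of the objective.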

In the proposed scheme, the reconstruction is a two-step process. First, an image is obtained from the acquired data directly by solving Eq. (1). As shown in Fig. 1(b), the image O thus reconstructed is usually very noisy when β is small. Deep learning then comes in at the second step: a neural network attempts to reconstruct the object image T from the noisy or even corrupted O. As shown schematically in Figs 2 and 3, GIDL's image reconstruction consists of two phases: training and testing. In the training phase, we used a set of 2000 handwritten digits of size 32 × 32 pixels from the MNIST handwritten digit database 31 to train the network. Some of the digits are shown in Fig. 3. To train the network, we first reconstructed the images of the digits in the training set according to Eq. (1). Then we fed these images, together with the corresponding ground-truth digits, into the neural network and optimized the weighting factors connecting neurons in adjacent hidden layers. In this work we used a deep neural network (DNN) model with two reshaping layers, three hidden layers and an output layer; we deliberately used a very simple model for demonstration. The reshaping layer at the input end reshapes the 32 × 32 input speckle image into a 1 × 1024 vector. All the hidden layers and the output layer have 1024 neurons. The activation function of these neurons is the rectified linear unit (ReLU), which, compared with the sigmoid function 32, enables faster and more effective training of deep neural architectures on large and complex data sets. The reshaping layer at the output end reshapes the 1 × 1024 vector back into a 32 × 32 image. The loss function and optimizer of the DNN model are the mean squared error (MSE) and stochastic gradient descent (SGD). Once training is complete (after 500 epochs in our experiments), the DNN can be used to reconstruct the object image T from O.
The program was implemented with Python version 3.5, and the DNN with the Keras framework on top of TensorFlow. An NVIDIA Tesla K20c GPU was used to speed up the computation.
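A minimal sketch of the architecture described above, written with the Keras API on TensorFlow as in the paper. The layer sizes follow the text; the exact hyperparameters of the original code are not given, so the SGD settings here are Keras defaults:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def build_gidl_dnn(side=32):
    """Reshape -> 3 hidden ReLU layers -> linear output -> reshape, as in the text."""
    n = side * side                             # 1024 neurons for 32x32 images
    model = keras.Sequential([
        keras.Input(shape=(side, side)),
        layers.Reshape((n,)),                   # 32x32 image -> 1x1024 vector
        layers.Dense(n, activation="relu"),     # hidden layer 1
        layers.Dense(n, activation="relu"),     # hidden layer 2
        layers.Dense(n, activation="relu"),     # hidden layer 3
        layers.Dense(n),                        # output layer, 1024 neurons
        layers.Reshape((side, side)),           # 1x1024 vector -> 32x32 image
    ])
    model.compile(optimizer="sgd", loss="mse")  # SGD optimizer, MSE loss
    return model

model = build_gidl_dnn()
# Training would pair noisy GI reconstructions O with ground-truth images T, e.g.:
# model.fit(noisy_reconstructions, ground_truth_images, epochs=500)
```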

Flowchart of GI using deep neural networks. The blue part represents the training phase and the orange part represents the testing phase.


GI framework with deep neural networks.


The simulation results in Fig. 1(d-f) show the images reconstructed with GIDL after 10, 100 and 500 training epochs for different measurement ratios β. From these images we can conclude the following. First, as the number of iterations (epochs) increases, the DNN model becomes better optimized; as a result, the reconstructed images become clearer and brighter. However, when the number of epochs becomes too large, we observed overfitting of the data, resulting in bit errors in the reconstructed images, as shown by the black spots. Second, GIDL is not very sensitive to β: the MSE values between the images in Fig. 1(f) and the corresponding ground-truth images in Fig. 1(a) are all around 0.03, even when β = 0.1. This means that by using GIDL for image reconstruction, the number of measurements in the GI acquisition process can be reduced considerably; the time efficiency can thus be improved without deteriorating the image quality. We note that a comparable reduction in measurements can also be achieved with the CSGI framework 10. However, taking a closer look at the zoomed images of one of the reconstructed digits, for example the digit '6' in the inset of Fig. 1, one can clearly see that the image reconstructed with CSGI is not smooth, owing to the sparsity regularization, while GIDL gives a much better reconstruction. This is an essential difference between the images reconstructed with GIDL and CSGI.

An additional advantage of GIDL over other GI frameworks is its robustness to noise. We now give a theoretical analysis. For a sufficiently large number of photons, the signal S_m observed by the single-pixel detector can be modeled with additive random Gaussian noise 18:

$$S_{m}=\int I_{m}(x)\,T(x)\,\mathrm{d}x+w\,\sigma_{m}\,\varepsilon_{m},\qquad (3)$$


where the variance \(w^{2}\sigma_{m}^{2} = w^{2}\int I_{m}(x)\,T(x)\,\mathrm{d}x\) and ε_m is standard white Gaussian noise. In the variance, w represents the noise level; a larger value of w results in a more corrupted detected signal. For speckle-field illumination with the same statistics, \(\sigma_{m}^{2}\) can be regarded as invariant, so that it can be replaced by a constant value \(\sigma^{2} \simeq \sum_{m} \sigma_{m}^{2}/M\).
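The detection model above is easy to simulate. A minimal NumPy sketch, in which the object, pattern statistics and noise levels are illustrative choices:

```python
import numpy as np

def noisy_bucket(patterns, T, w, rng):
    """S_m = int I_m(x) T(x) dx + w * sigma_m * eps_m, where sigma_m^2 equals
    the clean bucket value itself, as in the variance expression above."""
    clean = np.einsum("mxy,xy->m", patterns, T)   # noise-free bucket signals
    sigma = np.sqrt(clean)                        # sigma_m = sqrt(int I_m T dx)
    return clean + w * sigma * rng.standard_normal(len(clean))

rng = np.random.default_rng(2)
T = np.zeros((32, 32))
T[8:24, 8:24] = 1.0
I = rng.random((1000, 32, 32))
S_clean = noisy_bucket(I, T, w=0.0, rng=rng)      # w = 0: no detection noise
S_noisy = noisy_bucket(I, T, w=50.0, rng=rng)     # w = 50: heavily corrupted signals
```

Feeding such noisy signals into the reconstruction pipelines reproduces the noise-robustness comparison discussed below.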

The simulation results are shown in Fig. 4. Figure 4(a) shows the images reconstructed using CSGI under different levels of detection noise. If the noise level w is small (w = 1), the reconstructed images are close to the ground truths, which means that CSGI can tolerate low-level noise. However, if the noise rises to a certain level, CSGI fails. The reconstructed image quality is also influenced by the measurement ratio β in CSGI: in the case of w = 50 and β = 0.1, the images reconstructed by CSGI are completely corrupted by noise according to our simulation.

Noise robustness of GIDL. ( a, b ) Images reconstructed with CSGI and GIDL under different levels of detection noise. Insets: zoomed images of the digit object '5' reconstructed with CSGI and GIDL at a high noise level and a low measurement ratio.


In contrast, GIDL performs much better. Figure 4(b) shows the images reconstructed using GIDL under different levels of detection noise. Consistent with Fig. 1(d-f), all of the images in Fig. 4(b) are smooth compared with the CSGI reconstructions. The inset shows the zoomed images of the digit object '5' reconstructed with CSGI and GIDL at a high noise level and a low measurement ratio (β = 0.1). It can be seen that the feature details of the digit '5' are clearly discernible in the image reconstructed by GIDL, but not in the one reconstructed by CSGI. This shows the advantage of GIDL over CSGI for imaging and sensing in harsh environments. Although the sparsity constraint can reduce the influence of random detection noise up to a certain level, CSGI cannot work at high noise levels, where the linear model of Eq. (2) is badly violated. In contrast, the deep learning architecture in GIDL implicitly accounts for the noise when building the network model and maps the noisy, partially reconstructed O to the corresponding object image T. However, if the noise level continues to rise along with a decrease in β, the effect of the additive noise can no longer be completely ignored: as shown by the digits '4' and '5' in Fig. 4(b), the reconstructed image becomes blurred, distorting the features of the object.


We now demonstrate the proposed GIDL with a few proof-of-principle experiments. We used the ghost imaging setup shown in Fig. 5. A laser beam with wavelength λ = 532 ± 2 nm (Verdi G2 SLM, Coherent, Inc.) was expanded by a 4-f system consisting of lens 1 and lens 2. An SLM 1 (Pluto-Vis, Holoeye Photonics AG) was used to sequentially display the phase distributions that produce the speckle illumination I_m, while the objects were displayed on an SLM 2 (Pluto-Vis, Holoeye Photonics AG). The collimated laser beam fell on SLM 1 and was modulated by the patterns displayed on it. The beam reflected from it was projected onto SLM 2 by a second 4-f system, consisting of lens 3 and lens 4. In the setup, P1, P2 and P3 are linear polarizers. P1 and P3 are vertically polarized and P2 is horizontally polarized with respect to the laboratory coordinates, so that only amplitude modulation is realized by the SLMs. We displayed different digits from the MNIST database 31 on SLM 2; these served as the objects in our experiments. The beam reflected from SLM 2 was recorded with an sCMOS camera (Zyla 4.2 PLUS sCMOS, Andor Technology Ltd.), because we do not have a bucket detector. We integrated each recorded intensity pattern to generate S_m. This does not affect the experimental results, apart from the frame rate and the signal gain, since the integral of an intensity pattern recorded by the camera is proportional to the optical power.

Schematic of the ghost imaging setup. P1, P2 and P3 are linear polarizers.


In the experiments we used the same training and test sets as in the simulation; 2000 different digits in the training set were used to train the network. To demonstrate the proposed scheme, we collected a very small amount of data to reconstruct the test digits. To accelerate the convergence of the DNN model, we used the Adam optimizer 33, a first-order algorithm for gradient-based optimization of stochastic objective functions, instead of SGD during training. The experimental results for β = 0.1 and β = 0.05 are plotted in Fig. 6. In this figure, the images in the first row are the ground-truth images of four digits from the test set. Because of the low β and the noise in the system, the images reconstructed using conventional GI are corrupted by noise, as shown in the second row of Fig. 6; from these reconstructed images, one cannot identify any visible feature of the target digits. However, when we fed them into the trained DNN model, we obtained the corresponding images shown in the third row of Fig. 6. Although they do not exactly resemble the ground truths, the images reconstructed by GIDL contain enough features to be recognized. In contrast, the images reconstructed with CSGI are still recognizable at the measurement ratio β = 0.1, but are completely corrupted at β = 0.05. This suggests that GIDL performs better than CSGI when the measurement ratio is low.

Test results at β = 0.1 and β = 0.05. The images in the first row are the ground truths, the second row shows the images reconstructed with GI, the third row shows the objects predicted by GIDL, and the last row shows the images reconstructed with CSGI.



In summary, we have demonstrated the novel GIDL technique with both numerical simulations and optical experiments. We analyzed the performance of conventional GI, CSGI and GIDL under various noise and measurement-ratio conditions and found that GIDL performs much better than the others, especially when the measurement ratio β is small. This enables the data acquisition time in ghost imaging to be reduced significantly and offers a promising solution to the challenges that keep GI from practical applications. In addition, our study opens up new possibilities for artificial intelligence techniques in ghost imaging and, more broadly, computational imaging.



This project was funded by the Key Research Program of Frontier Sciences, Chinese Academy of Sciences (QYZDB-SSW-JSC002), the National Natural Science Foundation of China (61377005, 61327902), the China Postdoctoral Science Foundation (2015M580356), and the Natural Science Foundation of Shanghai (No. 17ZR1433800).

