
Deep-STORM: super-resolution single-molecule microscopy by deep learning

Open Access

Abstract

We present an ultrafast, precise, parameter-free method, which we term Deep-STORM, for obtaining super-resolution images from stochastically blinking emitters, such as fluorescent molecules used for localization microscopy. Deep-STORM uses a deep convolutional neural network that can be trained on simulated data or experimental measurements, both of which are demonstrated. The method achieves state-of-the-art resolution under challenging signal-to-noise conditions and high emitter densities and is significantly faster than existing approaches. Additionally, no prior information on the shape of the underlying structure is required, making the method applicable to any blinking dataset. We validate our approach by super-resolution image reconstruction of simulated and experimentally obtained data.

© 2018 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. INTRODUCTION

In conventional microscopy, the spatial resolution of an image is bounded by Abbe’s diffraction limit, corresponding to approximately half the optical wavelength. Super-resolution methods, e.g., stimulated emission depletion [1,2], structured illumination microscopy [3–5], and localization microscopy, namely, (fluorescence) photo-activated localization microscopy ((F)PALM) [6,7] and stochastic optical reconstruction microscopy (STORM) [8], have revolutionized biological imaging in the last decade, enabling the observation of cellular structures at the nanoscale [9]. Localization microscopy relies on acquiring a sequence of diffraction-limited images, each containing point-spread functions (PSFs) produced by a sparse set of emitting fluorophores. The emitters are then localized with high precision. By combining the recovered emitter positions from all frames, a super-resolved image is produced with resolution typically an order of magnitude better than the diffraction limit (down to tens of nanometers).

In localization microscopy, regions with a high density of overlapping emitters pose an algorithmic challenge. The emitter-sparsity constraint leads to long acquisition times (seconds to minutes), which limits the ability to capture fast dynamics of subwavelength processes within live cells. Various algorithms have been developed to handle overlapping PSFs. Existing classes of algorithms are based on sequential fitting of emitters followed by subtraction of the model PSF [10–13], blinking statistics [14–16], sparsity [17–23], multi-emitter maximum likelihood estimation [24], or even single-image super-resolution by dictionary learning [25,26]. While successful localization of densely spaced emitters has been demonstrated, all existing methods suffer from two fundamental drawbacks: data-processing time and sample-dependent parameter tuning. Even accelerated sparse-recovery methods such as CEL0 [21], which employs the fast FISTA algorithm [27], still involve a time-consuming iterative procedure and scale poorly with the recovered grid size. In addition, current methods rely on parameters that balance different trade-offs in the recovery process; these must be tuned carefully through trial and error to obtain satisfactory results, requiring user expertise and tweaking time.

Here we demonstrate precise, fast, parameter-free, super-resolution image reconstruction by harnessing deep learning. Convolutional neural networks have shown impressive results in a variety of image-processing and computer-vision tasks, such as single-image resolution enhancement [28–32] and segmentation [33–35]. In this work, we employ a fully convolutional neural network for super-resolution image reconstruction from dense fields of overlapping emitters. Our method, dubbed Deep-STORM, does not explicitly localize emitters; instead, it creates a super-resolved image directly from the raw data. The net produces images with reconstruction resolution comparable to or better than that of existing methods; furthermore, the method is extremely fast, and our software can leverage GPU computation for further enhanced speed. Moreover, Deep-STORM is parameter free, requiring no expertise from the user, and is easily applied to any single-molecule dataset. Importantly, Deep-STORM is general and does not rely on prior knowledge of the structure in the sample, unlike recently demonstrated single-shot image enhancement by deep learning [36].

2. METHODS

A. Deep Learning

In short, Deep-STORM utilizes an artificial neural net that receives a set of frames of (possibly very dense) point emitters and outputs a set of super-resolved images (one per frame), based on prior training performed on simulated or experimentally obtained images with known emitter positions. The output images are then summed to produce a single super-resolved image.
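The per-frame prediction and summation step is simple; below is a minimal sketch (the names model and frames are ours, and the upsampling and normalization described under Training below are assumed to have already been applied to the frames):

```python
# Sketch: apply a trained Deep-STORM-style model to a stack of
# preprocessed frames and sum the per-frame predictions into one
# super-resolved image.
import numpy as np

def reconstruct(model, frames):
    """frames: (n, H, W) array of upsampled, normalized acquisitions."""
    preds = model.predict(frames[..., None])  # (n, H, W, 1) predictions
    return preds.sum(axis=0).squeeze()        # final super-resolved image
```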

1. Architecture

The net architecture is based on a fully convolutional encoder–decoder network and was inspired by previous work on cell counting [37]. The network (Fig. 1) first encodes the input intensity image into a dense, aggregated feature representation through three 3×3 convolutional layers of increasing depth, interleaved with 2×2 max-pooling layers (Fig. S1 in Supplement 1). Afterwards, in the decoding stage, the spatial dimensions are restored to the size of the input image through three successive deconvolution layers, each consisting of 2×2 upsampling, interleaved with 3×3 convolutional layers of decreasing depth. Convolutional layers, for both encoding and decoding, refer to a composite of convolution filters followed by batch normalization [38] and a ReLU nonlinearity [39]. Since only 3×3 filters are used, the resulting architecture is relatively small, with only 1.3M trainable parameters. The final pixelwise prediction (the super-resolution frame) is produced by a depth-reducing 1×1 convolutional filter with a linear activation function.
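For concreteness, the following is a minimal Keras sketch of an encoder–decoder of this shape. The filter depths (16/32/64) are illustrative assumptions on our part; the exact configuration is given in Fig. S1 of Supplement 1.

```python
# Sketch of a Deep-STORM-style fully convolutional encoder-decoder.
from tensorflow.keras import layers, Model

def conv_block(x, depth):
    """3x3 convolution -> batch normalization -> ReLU."""
    x = layers.Conv2D(depth, 3, padding='same')(x)
    x = layers.BatchNormalization()(x)
    return layers.Activation('relu')(x)

def build_net(input_shape=(208, 208, 1)):
    inp = layers.Input(shape=input_shape)
    # Encoder: three conv blocks of increasing depth, each followed
    # by 2x2 max pooling.
    x = conv_block(inp, 16)
    x = layers.MaxPooling2D(2)(x)
    x = conv_block(x, 32)
    x = layers.MaxPooling2D(2)(x)
    x = conv_block(x, 64)
    x = layers.MaxPooling2D(2)(x)
    # Decoder: three 2x2 upsampling steps interleaved with conv blocks
    # of decreasing depth, restoring the input spatial size.
    x = layers.UpSampling2D(2)(x)
    x = conv_block(x, 64)
    x = layers.UpSampling2D(2)(x)
    x = conv_block(x, 32)
    x = layers.UpSampling2D(2)(x)
    x = conv_block(x, 16)
    # Depth-reducing 1x1 convolution with a linear activation produces
    # the pixelwise super-resolution prediction.
    out = layers.Conv2D(1, 1, activation='linear')(x)
    return Model(inp, out)
```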


Fig. 1. Network architecture. A set of low-resolution diffraction-limited images of stochastically blinking emitters is fed into the network to produce reconstructed high-resolution images. The resulting outputs are then summed to generate the final super-resolved image.


2. Training

Given the camera specifications, the PSF model, the approximate signal-to-noise ratio (SNR), and the expected emitter density, twenty 64×64 pixel images containing randomly positioned emitters are simulated using the ImageJ [40,41] ThunderSTORM plugin [42]. From each frame we extract 500 random 26×26 regions and their respective ground-truth xy emitter positions. To generate the final training examples, we upsample each region by a factor of 8 and project the corresponding emitter positions onto the high-resolution grid. The result is a set of 10K training pairs, each consisting of an upsampled low-resolution region (208×208 pixels) and an image with spikes at the ground-truth positions. Each region is normalized by subtracting its mean and dividing by the averaged per-region standard deviation of the dataset; no additional data augmentation is used. An example training input image and the corresponding output (after training) are shown in Fig. 2.
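To make the procedure concrete, here is a sketch of assembling one training pair from a simulated frame. The helper name, the bilinear interpolation order, and the row/column coordinate convention are our assumptions; in practice the frames come from ThunderSTORM.

```python
# Sketch: crop a random 26x26 region, upsample it 8x, and project the
# ground-truth emitter positions onto the 208x208 high-resolution grid.
import numpy as np
from scipy.ndimage import zoom

UPSAMPLE, REGION = 8, 26

def make_training_pair(frame, emitters_xy, rng):
    """frame: 64x64 image; emitters_xy: (n, 2) positions in pixel units."""
    r0 = rng.integers(0, frame.shape[0] - REGION)
    c0 = rng.integers(0, frame.shape[1] - REGION)
    region = frame[r0:r0 + REGION, c0:c0 + REGION]
    upsampled = zoom(region, UPSAMPLE, order=1)  # 208x208 input image
    # Target: spikes at the ground-truth positions on the fine grid.
    target = np.zeros((REGION * UPSAMPLE, REGION * UPSAMPLE))
    for x, y in emitters_xy:
        r, c = int((y - r0) * UPSAMPLE), int((x - c0) * UPSAMPLE)
        if 0 <= r < target.shape[0] and 0 <= c < target.shape[1]:
            target[r, c] = 1.0
    return upsampled, target
```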


Fig. 2. Simulated dense emitters. (a) Low-resolution image. Scale bar is 0.5 μm. (b) Deep-STORM prediction on a 12.5 nm grid with ground truth emitter locations overlaid as cross marks on top.


3. Loss Function

Unlike typical localization-microscopy approaches, Deep-STORM directly outputs the super-resolved image rather than a list of localizations. Therefore, as a loss function for training the net, we adopt a regression approach. Specifically, we measure the squared $\ell_2$ distance between the network’s prediction and the ground-truth image (consisting of delta functions at the emitter positions), each convolved with a small 2D Gaussian kernel. To promote sparsity of the network’s output, we also introduce an $\ell_1$ penalty. Let $x_i$ be the image with delta functions at the ground-truth positions, $\hat{x}_i$ the network’s prediction, $g$ the Gaussian kernel, $N$ the number of images in the training set, and let $\ast$ denote convolution. The resulting loss function is

$$\ell(x,\hat{x}) = \frac{1}{N}\sum_{i=1}^{N}\left(\left\|\hat{x}_i \ast g - x_i \ast g\right\|_2^2 + \left\|\hat{x}_i\right\|_1\right).$$

It is possible to incorporate a regularization parameter into the $\ell_1$ term to control the desired sparsity level; however, we observed high robustness of the resulting predictions to such a parameter. Hence, we chose to keep Deep-STORM parameter free. The network was implemented in Keras [43] with a TensorFlow [44] backend. We trained the network for 100 epochs on batches of 16 samples using the Adam [45] optimizer with the default parameters, a Gaussian kernel with σ=1 pixel, and an initial learning rate of 0.001. Training and evaluation were run on a standard workstation equipped with 32 GB of memory, an Intel(R) Core(TM) i7-8700 3.20 GHz CPU, and an NVIDIA GeForce Titan Xp GPU with 12 GB of video memory. Full network training took 2 h. Our code is publicly available (Code 1 [46]).
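For illustration, the loss above translates almost directly into a custom Keras/TensorFlow loss. This is a minimal sketch under our assumptions (a 7×7 kernel support for σ = 1 pixel, NHWC tensors with a single channel); it is not the released implementation.

```python
# Sketch of the Deep-STORM loss: squared L2 distance between the
# Gaussian-blurred prediction and the Gaussian-blurred ground-truth
# spike image, plus an L1 sparsity term on the raw prediction.
# The L1 weight is fixed at 1, matching the parameter-free design.
import numpy as np
import tensorflow as tf

def gaussian_kernel(sigma=1.0, size=7):
    """Normalized 2D Gaussian as a (size, size, 1, 1) conv filter."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    k /= k.sum()
    return tf.constant(k[:, :, None, None], dtype=tf.float32)

G = gaussian_kernel(sigma=1.0)  # the kernel g with sigma = 1 pixel

def deep_storm_loss(y_true, y_pred):
    # Convolve both images with the small 2D Gaussian kernel g.
    true_blur = tf.nn.conv2d(y_true, G, strides=1, padding='SAME')
    pred_blur = tf.nn.conv2d(y_pred, G, strides=1, padding='SAME')
    l2 = tf.reduce_sum(tf.square(pred_blur - true_blur), axis=[1, 2, 3])
    l1 = tf.reduce_sum(tf.abs(y_pred), axis=[1, 2, 3])
    return tf.reduce_mean(l2 + l1)
```

A loss with this signature can be passed directly to model.compile(optimizer='adam', loss=deep_storm_loss).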

B. Microscopy

Quantum dot (QD) samples were prepared by diluting 705 nm emitting QDs (Invitrogen) 1:1000 v/v in 1% poly(vinyl alcohol) (Mowiol 8-88, Sigma-Aldrich), then spin coating onto a standard glass coverslip (no. 1.5, Fisher Scientific). Images were recorded using Nikon imaging software, which controlled a standard inverted microscope (Eclipse TI, Nikon) with a 405 nm light source (iChrome MLE, Toptica). Fluorescence emission from the QDs was collected using a high numerical aperture (1.49), 100× objective lens (CFI Apochromat TIRF 100XC Oil, Nikon), chromatically filtered to remove background (ZT488rdc and ET500LP, Chroma), and then captured with a 400 ms exposure time on an sCMOS camera (95B, Photometrics). To achieve a variety of SNRs and emitter densities, images were taken at various laser powers and combined in post-processing.

3. RESULTS

We validated Deep-STORM on both simulated and experimental data. All microtubule reconstructions were obtained on a grid with a 12.5 nm pixel size, and QD reconstructions were obtained on a grid with a 13.75 nm pixel size. To estimate the expected resolution of the net’s output, we simulated the reconstruction of a synthetic structure of horizontal stripes at decreasing separations, at various emitter densities, using nets trained accordingly, for a reasonable single-molecule-level SNR of 1000 signal photons per emitter and 10 background photons per pixel (Fig. 3). Notably, the minimal resolvable distance between stripes increases as a function of emitter density, ranging from as fine as 19 nm at 1 emitter/μm² up to 31 nm at 9 emitters/μm². A similar resolution analysis for various SNR values is included in Section 3 of Supplement 1.


Fig. 3. Resolution and emitter density (simulation). (a) Diffraction-limited image of horizontal lines. Scale bar is 500 nm. (b) Simulated single frames of emitters at various densities with a mean of 10 background photons per pixel and 1000 signal photons per emitter. (c) The ground truth positions of simulated emitters. (d) Deep-STORM reconstructed images. (e) Sum along the horizontal axis of the reconstruction intensities.


Next, we tested Deep-STORM on super-resolution data and benchmarked it against a recently developed high-performance multi-emitter fitting algorithm (CEL0 [21]). First, we reconstructed a simulated microtubule dataset available on the EPFL SMLM challenge website [47] (Fig. 4). The optimal regularization parameter for CEL0 was set empirically to λ=0.25 through a comprehensive trial-and-error process, such that spurious detections were minimized and the number of recovered positions was roughly equal to the number of underlying emitters. The number of IRL1 and inner FISTA iterations was set to 200. Since Deep-STORM is not constrained to output emitter positions, we quantified the quality of the results using image similarity measures rather than a point-list comparison such as the Jaccard index. Specifically, we used the standard normalized mean square error, $\mathrm{NMSE}(\hat{x},x) = \frac{\|\hat{x}-x\|_2^2}{\|x\|_2^2} \times 100\%$. Deep-STORM showed an improved NMSE of 37%, compared to 72% for the CEL0 raw histogram and 69% for the CEL0 result convolved with a Gaussian with σ=1 pixel, where σ=1 was optimized to produce the lowest NMSE. Deep-STORM managed to resolve nearby microtubule edges (Fig. 4) and recovered the curvature of the underlying structure more accurately than CEL0 (highlighted by white triangles in Fig. 5). To quantify the resolution, we analyzed simulated frames containing many molecules along a line and used the trained net from above. The line width (FWHM) was 24 nm (Fig. S6 in Supplement 1).
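For reference, the NMSE metric above is straightforward to compute; a minimal helper (the function name is ours):

```python
import numpy as np

def nmse(pred, gt):
    """Normalized mean square error between reconstruction and ground truth, in percent."""
    return 100.0 * np.sum((pred - gt) ** 2) / np.sum(gt ** 2)
```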


Fig. 4. Simulated microtubules. (a) Sum of the acquisition stack. Scale bar is 1 μm. (b) Ground truth. (c) Reconstruction by the CEL0 method. (d) Reconstruction by Deep-STORM. (e), (f) Magnified views of two selected regions. Scale bars are 0.5 μm.



Fig. 5. Reconstruction accuracy. (a) Ground truth image of simulated microtubules. Scale bar is 1 μm. (b) Merged reconstruction with the ground truth. Red shows the ground truth, green corresponds to the recovery result, and yellow marks their overlap. Note that CEL0 (left) does not follow the twisted shape in all places (white triangles), while Deep-STORM (right) better recovers the underlying structure.


Second, we tested Deep-STORM on experimental data obtained from Sage et al. [47], training solely on simulated data with similar experimental conditions, namely, SNR and emitter density. Deep-STORM resolves nearby lines and fine structures and produces more continuous shapes than the output of CEL0 (Fig. 6). Both the simulated and experimental datasets were also compared to a fast multi-emitter fitting algorithm (FALCON [20]). The results show that Deep-STORM is also superior to FALCON on both datasets, with an NMSE of 37% compared to 61% on the simulated dataset and better-resolved structures in the experimental dataset (Figs. S3, S4, and S5 in Supplement 1).


Fig. 6. Experimentally measured microtubules. (a) Sum of the acquisition stack. Scale bar is 2 μm. (b) Reconstruction by the CEL0 method. (c) Reconstruction by Deep-STORM. (d), (e) Magnified views of two selected regions. Scale bars are 0.5 μm. (f) The width projection of the highlighted yellow region. The attained FWHM (black triangles) was 61 nm for CEL0 and 67 nm for Deep-STORM. The black line shows the diffraction-limited projection.


Ultimately, the best training set should include the aberrations of the experimental imaging system; however, training a deep neural network typically requires very large datasets, and obtaining massive amounts of experimental images is not straightforward. Nevertheless, we found that a reasonable number of experimental images is sufficient to train a high-quality net. We trained and tested Deep-STORM on a sample containing randomly scattered fluorescent quantum dots to evaluate its performance on experimental data at high density and over the variety of SNR conditions encountered in single-molecule datasets. To obtain a high-density dataset with relatively well-known positions, we first acquired 100 images of sparse, randomly distributed quantum dots (a total of 1560 emitters) and localized them with high precision using ThunderSTORM [42]. The sparse frames were then cropped into smaller regions and summed to generate dense regions for training (1200 regions) or evaluation (360 regions). Specifically, we chose eight random regions at a time and summed them. Notably, by combining and shifting portions of only 100 images, we produced a library of 10K summed regions for training the network and 3K for testing. The resulting imaging conditions were challenging: an emitter density of around 2 emitters/μm², a mean of 2500 signal photons per emitter, and total additive Gaussian noise with a standard deviation of σ=20 photons per pixel. In the 3K regions reserved for evaluation, Deep-STORM correctly identified 96% of the emitters localized by ThunderSTORM prior to combining frames, with a low false-positive rate of 1.6% (Fig. 7). Under these conditions, Deep-STORM generates super-resolved images containing small “blobs”, usually within 3×3 pixel regions, with the peak at the center. For nearby emitters, Deep-STORM produces a slightly asymmetric blob. This minor blur is also apparent in the previous examples; however, it has little effect on the resulting super-resolved image (e.g., see Fig. 6).
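The dense-region synthesis described above can be sketched as follows; the ±4 pixel shift range and the 8× label grid are our assumptions, and sparse_regions/labels are hypothetical names for the cropped frames and their spike images.

```python
# Sketch: build one dense training region by summing eight randomly
# chosen, randomly shifted sparse quantum-dot regions, applying the
# same shift to the corresponding high-resolution label.
import numpy as np

def make_dense_region(sparse_regions, labels, rng, n_combine=8, up=8):
    """sparse_regions: (M, H, W) crops; labels: matching (M, up*H, up*W) spike images."""
    idx = rng.choice(len(sparse_regions), size=n_combine, replace=False)
    dense = np.zeros(sparse_regions[0].shape, dtype=float)
    target = np.zeros(labels[0].shape, dtype=float)
    for i in idx:
        dr, dc = rng.integers(-4, 5, size=2)  # random integer shift
        dense += np.roll(sparse_regions[i], (dr, dc), axis=(0, 1))
        target += np.roll(labels[i], (dr * up, dc * up), axis=(0, 1))
    return dense, target
```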


Fig. 7. Quantum dot experimental data. (a) Acquired low-resolution image. Scale bar is 1 μm. (b) Deep-STORM reconstruction with ground truth emitter positions (red crosses). (c) Magnified view of the selected region in (b).


Compared to reconstruction of the same images using a net trained on simulated data (as described above), the experimentally trained net performed better, detecting 96% of the emitters versus 88%, with a reduced false-positive rate of 1.6% versus 8.7%. This test demonstrates that while simulated data can serve as excellent training data, experimentally obtained images are even better. Moreover, a high-quality reconstruction net can be trained using a small number of experimentally measured images.

Finally, we tested the robustness of our method to a mismatch between the training data and the measured image. We found that Deep-STORM is relatively robust to a density mismatch of ±2 emitters/μm² (Fig. S8 in Supplement 1). In addition, we found that in the case of an SNR mismatch, it is preferable to train on lower-background examples to prevent a high false-positive rate (Fig. S9 in Supplement 1).

Deep-STORM not only yields image reconstruction results that are comparable to or better than those of leading algorithms but also does so 1–3 orders of magnitude faster. Table 1 compares the runtime of Deep-STORM with CEL0 and FALCON on both the simulated and experimental microtubule datasets (Figs. 4 and 6). The simulated dataset consists of 361 frames containing 81K emitters in total, with a mean density of 5.48 emitters/μm². The experimental dataset consists of 500 frames containing 520K emitters, with a mean density of 6.35 emitters/μm², approximated using the number of emitters recovered by CEL0. Deep-STORM exhibits significantly superior runtime, especially with GPU acceleration, equivalent to localizing 20,000 emitters per second, compared to 1500 emitters per second by the fastest existing multi-emitter fitting method of which we are aware (FALCON [20]).


Table 1. Runtime Comparison

4. DISCUSSION

Since the introduction of single-molecule localization microscopy, numerous algorithms have been developed to reconstruct super-resolved images from movies of stochastically blinking emitters. In particular, considerable effort has been invested in solving the high-density emitter-fitting problem. Indeed, current methods for multi-emitter fitting produce high-quality images; however, this comes at a high computational cost, i.e., runtime, and frequently necessitates parameter tuning. In this work, we have presented a fast, precise, and parameter-free method for super-resolution imaging from localization-microscopy-type data. Deep-STORM uses a convolutional neural network trained on easily simulated or experimental data. Our experiments show that the net used in this work performs well up to a density of 6 emitters/μm², which is similar to leading multi-emitter fitting methods after tuning their parameters accordingly. We note that, in general, the maximal allowable density also depends on SNR. Notably, the main reason deep learning is highly suitable for this application is the ease with which training data can be generated: single-molecule images with realistic noise models are straightforward to simulate in the large numbers that deep learning often requires.

Our simulations show that Deep-STORM exhibits high robustness to the emitter density and SNR used for training (Supplement 1); nevertheless, to further increase performance in cases such as time-varying emitter densities or signal/background levels, the following simple generalization can be considered. Since training of the net is performed offline, a set of nets for various SNR and density values can be pretrained once. Then, in the reconstruction stage, a fast optimal-selection step can be applied to each captured frame, routing it to the best net given the estimated SNR and emitter density of the current frame, as sketched below.
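A sketch of such a routing step follows; the SNR and density estimators here are crude placeholders of our own and are not part of Deep-STORM.

```python
# Sketch: route each frame to the pretrained net whose training
# conditions best match the frame's estimated SNR and emitter density.
import numpy as np

def estimate_conditions(frame, background=10.0):
    """Very rough per-frame SNR and density proxies (placeholder heuristics)."""
    signal = max(frame.max() - background, 1.0)
    snr = signal / np.sqrt(signal + background)
    # Fraction of pixels well above background, as a density proxy.
    density = (frame > background + 5.0 * np.sqrt(background)).mean()
    return snr, density

def route_frame(frame, nets):
    """nets: dict mapping (snr_level, density_level) -> pretrained model."""
    snr, density = estimate_conditions(frame)
    key = min(nets, key=lambda k: (k[0] - snr) ** 2 + (k[1] - density) ** 2)
    return nets[key].predict(frame[None, ..., None])
```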

Although Deep-STORM uses localization-microscopy-type movies to produce a super-resolved image, it is not a localization-based technique. Localization microscopy is based on the additional information inherent in blinking molecules. However, as was demonstrated by other techniques, e.g., SOFI [14], extracting this information does not necessarily require compiling a list of molecular positions. Deep-STORM implicitly uses this additional information content to directly reconstruct a super-resolved image. The technique combines state-of-the-art resolution enhancement, unprecedented speed, and high flexibility (parameter-free operation). This combination produces a technique capable of video-rate analysis of super-resolution localization-microscopy data that requires no expertise from the end user, overcoming some of the most significant limitations of existing localization methods.

Funding

Google; Zuckerman Foundation; Technion-Israel Institute of Technology; Ollendorf Foundation; Taub Foundation; Israel Science Foundation (ISF) (852/17); Israel Academy of Sciences and Humanities.

Acknowledgment

The authors thank Dr. Daniel Freedman of Google Research for fruitful discussions. We also gratefully acknowledge the support of the NVIDIA Corporation for the donation of the Titan Xp GPU used for this research. E. N. is supported by a Google research award; L. E. W. and Y. S. are supported by the Zuckerman Foundation; Y. S. is supported in part by a Career Advancement Chairship from the Technion-Israel Institute of Technology; T. M. is supported in part by the Ollendorf Foundation. Additional support was provided by the Taub Foundation (through a Horev fellowship), an Alon Fellowship, the Israel Science Foundation (ISF), and the Israel Academy of Sciences and Humanities.

 

See Supplement 1 for supporting content.

REFERENCES

1. S. W. Hell and J. Wichmann, “Breaking the diffraction resolution limit by stimulated emission: stimulated-emission-depletion fluorescence microscopy,” Opt. Lett. 19, 780–782 (1994). [CrossRef]  

2. T. A. Klar and S. W. Hell, “Subdiffraction resolution in far-field fluorescence microscopy,” Opt. Lett. 24, 954–956 (1999). [CrossRef]  

3. M. A. A. Neil, R. Juškaitis, and T. Wilson, “Method of obtaining optical sectioning by using structured light in a conventional microscope,” Opt. Lett. 22, 1905–1907 (1997). [CrossRef]  

4. W. Lukosz and M. Marchand, “Optische Abbildung unter Überschreitung der beugungsbedingten Auflösungsgrenze” [“Optical imaging beyond the diffraction-limited resolution”], Opt. Acta 10, 241–255 (1963). [CrossRef]

5. M. Gustafsson, “Surpassing the lateral resolution limit by a factor of two using structured illumination microscopy,” J. Microsc. 198, 82–87 (2000). [CrossRef]  

6. E. Betzig, G. H. Patterson, R. Sougrat, O. W. Lindwasser, S. Olenych, J. S. Bonifacino, M. W. Davidson, J. Lippincott-Schwartz, and H. F. Hess, “Imaging intracellular fluorescent proteins at nanometer resolution,” Science 313, 1642–1645 (2006). [CrossRef]  

7. S. T. Hess, T. P. Girirajan, and M. D. Mason, “Ultra-high resolution imaging by fluorescence photoactivation localization microscopy,” Biophys. J. 91, 4258–4272 (2006). [CrossRef]  

8. M. J. Rust, M. Bates, and X. Zhuang, “Sub-diffraction-limit imaging by stochastic optical reconstruction microscopy (STORM),” Nat. Methods 3, 793–795 (2006). [CrossRef]  

9. S. J. Sahl and W. Moerner, “Super-resolution fluorescence imaging with single molecules,” Curr. Opin. Struct. Biol. 23, 778–787 (2013). [CrossRef]  

10. J. Högbom, “Aperture synthesis with a non-regular distribution of interferometer baselines,” Astron. Astrophys. Suppl. Ser. 15, 417–426 (1974).

11. A. Sergé, N. Bertaux, H. Rigneault, and D. Marguet, “Dynamic multiple-target tracing to probe spatiotemporal cartography of cell membranes,” Nat. Methods 5, 687–694 (2008). [CrossRef]  

12. X. Qu, D. Wu, L. Mets, and N. F. Scherer, “Nanometer-localized multiple single-molecule fluorescence microscopy,” Proc. Natl. Acad. Sci. USA 101, 11298–11303 (2004). [CrossRef]  

13. M. P. Gordon, T. Ha, and P. R. Selvin, “Single-molecule high-resolution imaging with photobleaching,” Proc. Natl. Acad. Sci. USA 101, 6462–6465 (2004). [CrossRef]  

14. T. Dertinger, R. Colyer, G. Iyer, S. Weiss, and J. Enderlein, “Fast, background-free, 3D super-resolution optical fluctuation imaging (SOFI),” Proc. Natl. Acad. Sci. USA 106, 22287–22292 (2009). [CrossRef]

15. S. Cox, E. Rosten, J. Monypenny, T. Jovanovic-Talisman, D. T. Burnette, J. Lippincott-Schwartz, G. E. Jones, and R. Heintzmann, “Bayesian localization microscopy reveals nanoscale podosome dynamics,” Nat. Methods 9, 195–200 (2012). [CrossRef]  

16. N. Gustafsson, S. Culley, G. Ashdown, D. M. Owen, P. M. Pereira, and R. Henriques, “Fast live-cell conventional fluorophore nanoscopy with ImageJ through super-resolution radial fluctuations,” Nat. Commun. 7, 12471 (2016). [CrossRef]  

17. S. J. Holden, S. Uphoff, and A. N. Kapanidis, “DAOSTORM: an algorithm for high-density super-resolution microscopy,” Nat. Methods 8, 279–280 (2011). [CrossRef]  

18. L. Zhu, W. Zhang, D. Elnatan, and B. Huang, “Faster STORM using compressed sensing,” Nat. Methods 9, 721–723 (2012). [CrossRef]  

19. A. Barsic, G. Grover, and R. Piestun, “Three-dimensional super-resolution and localization of dense clusters of single molecules,” Sci. Rep. 4, 5388 (2014). [CrossRef]  

20. J. Min, C. Vonesch, H. Kirshner, L. Carlini, N. Olivier, S. Holden, S. Manley, J. C. Ye, and M. Unser, “FALCON: fast and unbiased reconstruction of high-density super-resolution microscopy data,” Sci. Rep. 4, 4577 (2015). [CrossRef]  

21. S. Gazagnes, E. Soubies, and L. Blanc-Féraud, “High density molecule localization for super-resolution microscopy using CEL0 based sparse approximation,” in IEEE International Symposium on Biomedical Imaging (ISBI) (2017), p. 4.

22. S. Hugelier, J. J. De Rooi, R. Bernex, S. Duwé, O. Devos, M. Sliwa, P. Dedecker, P. H. Eilers, and C. Ruckebusch, “Sparse deconvolution of high-density super-resolution images,” Sci. Rep. 6, 21413 (2016). [CrossRef]  

23. O. Solomon, Y. C. Eldar, M. Mutzafi, and M. Segev, “SPARCOM: sparsity based super-resolution correlation microscopy,” arXiv:1707.09255 (2017).

24. F. Huang, S. L. Schwartz, J. M. Byars, and K. A. Lidke, “Simultaneous multiple-emitter fitting for single molecule super-resolution imaging,” Biomed. Opt. Express 2, 1377–1393 (2011). [CrossRef]  

25. M. Mutzafi, Y. Shechtman, Y. C. Eldar, and M. Segev, “Single-shot sparsity-based sub-wavelength fluorescence imaging of biological structures using dictionary learning,” in Conference on Lasers and Electro-Optics (CLEO) (2015), Vol. 3, paper STh4K.5.

26. M. Mutzafi, Y. Shechtman, O. Dicker, L. Weiss, Y. C. Eldar, W. E. Moerner, and M. Segev, “Experimental demonstration of sparsity-based single-shot fluorescence imaging at sub-wavelength resolution,” in Conference on Lasers and Electro-Optics (2017), Vol. 4, paper AW1A.6.

27. A. Beck and M. Teboulle, “A fast iterative shrinkage-thresholding algorithm for linear inverse problems,” SIAM J. Imaging Sci. 2, 183–202 (2009).

28. C. Dong, C. C. Loy, K. He, and X. Tang, “Image super-resolution using deep convolutional networks,” IEEE Trans. Pattern Anal. Mach. Intell. 38, 295–307 (2016). [CrossRef]  

29. J. Kim, J. K. Lee, and K. M. Lee, “Accurate image super-resolution using very deep convolutional networks,” in Computer Vision and Pattern Recognition (CVPR) (2016), pp. 1646–1654.

30. Z. Wang, D. Liu, J. Yang, W. Han, and T. Huang, “Deep networks for image super-resolution with sparse prior,” in Proceedings of the IEEE International Conference on Computer Vision (2016), pp. 370–378.

31. X. Mao, C. Shen, and Y.-B. Yang, “Image restoration using very deep convolutional encoder–decoder networks with symmetric skip connections,” in Advances in Neural Information Processing Systems, D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, eds. (Curran Associates, 2016), pp. 2802–2810.

32. Y. Rivenson, Z. Göröcs, H. Günaydin, Y. Zhang, H. Wang, and A. Ozcan, “Deep learning microscopy,” Optica 4, 1437–1443 (2017). [CrossRef]  

33. J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2015), pp. 3431–3440.

34. O. Ronneberger, P. Fischer, and T. Brox, “U-net: convolutional networks for biomedical image segmentation,” in International Conference on Medical Image Computing and Computer-Assisted Intervention (Springer, 2015), pp. 234–241.

35. H. Noh, S. Hong, and B. Han, “Learning deconvolution network for semantic segmentation,” in Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2015), pp. 1520–1528.

36. M. Weigert, U. Schmidt, T. Boothe, A. Müller, A. Dibrov, A. Jain, B. Wilhelm, D. Schmidt, C. Broaddus, S. Culley, M. Rocha-Martins, F. Segovia-Miranda, C. Norden, R. Henriques, M. Zerial, M. Solimena, J. Rink, P. Tomancak, L. Royer, F. Jug, and E. W. Myers, “Content-aware image restoration: pushing the limits of fluorescence microscopy,” bioRxiv (2017).

37. W. Xie, J. A. Noble, and A. Zisserman, “Microscopy cell counting and detection with fully convolutional regression networks,” in Computer Methods in Biomechanics and Biomedical Engineering: Imaging and Visualization (2016), pp. 1–10.

38. S. Ioffe and C. Szegedy, “Batch normalization: accelerating deep network training by reducing internal covariate shift,” in Proceedings of the 32nd International Conference on Machine Learning, F. Bach and D. Blei, eds., Vol. 37 of Proceedings of Machine Learning Research (PMLR, 2015), pp. 448–456.

39. A. L. Maas, A. Y. Hannun, and A. Y. Ng, “Rectifier nonlinearities improve neural network acoustic models,” in Proceedings of the 30th International Conference on Machine Learning (2013), Vol. 28, p. 6.

40. C. T. Rueden, J. Schindelin, M. C. Hiner, B. E. DeZonia, A. E. Walter, E. T. Arena, and K. W. Eliceiri, “ImageJ2: ImageJ for the next generation of scientific image data,” BMC Bioinf. 18, 529 (2017). [CrossRef]

41. J. Schindelin, I. Arganda-Carreras, E. Frise, V. Kaynig, M. Longair, T. Pietzsch, S. Preibisch, C. Rueden, S. Saalfeld, B. Schmid, J.-Y. Tinevez, D. J. White, V. Hartenstein, K. Eliceiri, P. Tomancak, and A. Cardona, “Fiji: an open-source platform for biological-image analysis,” Nat. Methods 9, 676–682 (2012). [CrossRef]  

42. M. Ovesný, P. Křížek, J. Borkovec, Z. Švindrych, and G. M. Hagen, “ThunderSTORM: a comprehensive ImageJ plug-in for PALM and STORM data analysis and super-resolution imaging,” Bioinformatics 30, 2389–2390 (2014). [CrossRef]  

43. F. Chollet, “Keras,” https://github.com/fchollet/keras (2015).

44. M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, “TensorFlow: large-scale machine learning on heterogeneous systems,” software available from http://www.tensorflow.org (2015).

45. D. P. Kingma and J. Ba, “Adam: a method for stochastic optimization,” arXiv:1412.6980 (2014).

46. “Nano-bio-optics lab—Yoav Shechtman,” http://nanobiooptics.net.technion.ac.il/.

47. D. Sage, H. Kirshner, T. Pengo, N. Stuurman, J. Min, S. Manley, and M. Unser, “Quantitative evaluation of software packages for single-molecule localization microscopy,” Nat. Methods 12, 717–724 (2015). [CrossRef]  

Supplementary Material (1)

Supplement 1: Supplemental information including network architecture, training information, performance evaluation, and further comparisons.


