Uppsala University Publications
Publications (10 of 47)
Ayyalasomayajula, K. R., Wilkinson, T., Malmberg, F. & Brun, A. (2019). CalligraphyNet: Augmenting handwriting generation with quill based stroke width. Paper presented at 26th IEEE International Conference on Image Processing.
2019 (English). Manuscript (preprint) (Other academic)
Abstract [en]

Realistic handwritten document generation garners a lot of interest from the document research community for its ability to generate annotated data. In the current approach we have used GAN-based stroke width enrichment and style-transfer-based refinement over generated data, which result in realistic-looking handwritten document images. The GAN part of data augmentation transfers the stroke variation introduced by a writing instrument onto images rendered from trajectories created by tracking coordinates along the stylus movement. The coordinates from stylus movement are augmented with the learned stroke width variations during the data augmentation block. An RNN model is then trained to learn the variation along the movement of the stylus along with the stroke variations corresponding to an input sequence of characters. This model is then used to generate images of words or sentences given an input character string. A document image thus created is used as a mask to transfer the style variations of the ink and the parchment. The generated image can capture the color content of the ink and parchment, which is useful for creating annotated data.

National Category
Computer Systems
Research subject
Computerized Image Processing
Identifiers
urn:nbn:se:uu:diva-379633 (URN)
Conference
26th IEEE International Conference on Image Processing
Note

Currently under review

Available from: 2019-03-19 Created: 2019-03-19 Last updated: 2019-04-08
Ayyalasomayajula, K. R., Malmberg, F. & Brun, A. (2019). PDNet: Semantic segmentation integrated with a primal-dual network for document binarization. Pattern Recognition Letters, 121, 52-60
2019 (English). In: Pattern Recognition Letters, ISSN 0167-8655, E-ISSN 1872-7344, Vol. 121, p. 52-60. Article in journal (Refereed) Published
National Category
Computer Vision and Robotics (Autonomous Systems)
Research subject
Computerized Image Processing
Identifiers
urn:nbn:se:uu:diva-366933 (URN); 10.1016/j.patrec.2018.05.011 (DOI); 000459876700008 ()
Funder
Swedish Research Council, 2012-5743; Riksbankens Jubileumsfond, NHS14-2068:1
Available from: 2018-05-16 Created: 2018-11-27 Last updated: 2019-04-04. Bibliographically approved
Ayyalasomayajula, K. R. & Brun, A. (2017). Document Binarization Combining with Graph Cuts and Deep Neural Networks. Paper presented at 36th Swedish Symposium on Image Analysis (SSBA) 2017.
2017 (English). Conference paper, Published paper (Other academic)
National Category
Computer Vision and Robotics (Autonomous Systems)
Research subject
Computerized Image Processing
Identifiers
urn:nbn:se:uu:diva-336171 (URN)
Conference
36th Swedish Symposium on Image Analysis (SSBA) 2017
Funder
Swedish Research Council, 2012-5743; Riksbankens Jubileumsfond, NHS14-2068:1
Available from: 2017-12-12 Created: 2017-12-12 Last updated: 2018-01-13. Bibliographically approved
Ayyalasomayajula, K. R. & Brun, A. (2017). Historical document binarization combining semantic labeling and graph cuts. In: Image Analysis: Part I. Paper presented at SCIA 2017, June 12–14, Tromsø, Norway (pp. 386-396). Springer
2017 (English). In: Image Analysis: Part I, Springer, 2017, p. 386-396. Conference paper, Published paper (Refereed)
Abstract [en]

Most data mining applications on collections of historical documents require binarization of the digitized images as a pre-processing step. Historical documents are often subjected to degradations such as parchment aging, smudges and bleed-through from the other side. The text is sometimes printed, but more often handwritten. Mathematical modeling of the appearance of the text, background and all kinds of degradations is challenging. In the current work we try to tackle binarization as a pixel classification problem. We first apply semantic segmentation, using fully convolutional neural networks. In order to improve the sharpness of the result, we then apply a graph cut algorithm. The labels from the semantic segmentation are used as approximate estimates of the text and background, with the probability map of background used for pruning the edges in the graph cut. The results obtained show significant improvement over the state-of-the-art approach.
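
The two-stage idea in this abstract, per-pixel class probabilities followed by a graph cut, can be illustrated with a small sketch. It is not the authors' implementation: the probabilities are made-up numbers standing in for a network's output, the image is a single 1-D row, the function name `binarize_row` and the constant `smoothness` weight are assumptions, and the edge pruning by background probability mentioned in the abstract is omitted. The minimum cut is found with a plain breadth-first augmenting-path max-flow.

```python
import math
from collections import deque


def binarize_row(p_text, smoothness=0.5):
    """Label each pixel of a 1-D row as text (1) or background (0) via min cut.

    p_text holds hypothetical per-pixel text probabilities; smoothness is an
    assumed constant penalty for neighbouring pixels that get different labels.
    """
    n = len(p_text)
    source, sink = n, n + 1
    cap = [dict() for _ in range(n + 2)]  # residual capacities per node

    def add_edge(u, v, c_uv, c_vu=0.0):
        cap[u][v] = cap[u].get(v, 0.0) + c_uv
        cap[v][u] = cap[v].get(u, 0.0) + c_vu

    eps = 1e-6
    for i, p in enumerate(p_text):
        p = min(max(p, eps), 1.0 - eps)
        add_edge(source, i, -math.log(1.0 - p))  # cut when i becomes background
        add_edge(i, sink, -math.log(p))          # cut when i becomes text
        if i + 1 < n:
            add_edge(i, i + 1, smoothness, smoothness)

    def reachable_from_source():
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = reachable_from_source()
        if sink not in parent:
            break  # no augmenting path left: the flow is maximal
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= flow
            cap[v][u] += flow

    # Pixels still reachable from the source lie on the text side of the cut.
    return [1 if i in parent else 0 for i in range(n)]


p_text = [0.9, 0.8, 0.4, 0.9, 0.1, 0.1]
thresholded = [1 if p > 0.5 else 0 for p in p_text]
smoothed = binarize_row(p_text)
print(thresholded, smoothed)
```

With these sample values, a plain 0.5 threshold leaves an isolated background pixel inside the text run, while the cut relabels it as text because two label changes cost more than one weak unary term.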

Place, publisher, year, edition, pages
Springer, 2017
Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 10269
National Category
Computer Vision and Robotics (Autonomous Systems)
Research subject
Computerized Image Processing
Identifiers
urn:nbn:se:uu:diva-335335 (URN); 10.1007/978-3-319-59126-1_32 (DOI); 000454359300032 (); 978-3-319-59125-4 (ISBN)
Conference
SCIA 2017, June 12–14, Tromsø, Norway
Funder
Swedish Research Council, 2012-5743; Riksbankens Jubileumsfond, NHS14-2068:1
Available from: 2017-05-19 Created: 2017-12-04 Last updated: 2019-03-19. Bibliographically approved
Wilkinson, T., Lindström, J. & Brun, A. (2017). Neural Ctrl-F: Segmentation-free Query-by-String Word Spotting in Handwritten Manuscript Collections. Paper presented at Swedish Symposium on Deep Learning (SSDL).
2017 (English). Conference paper, Poster (with or without abstract) (Other academic)
National Category
Computer Vision and Robotics (Autonomous Systems)
Research subject
Computerized Image Processing
Identifiers
urn:nbn:se:uu:diva-336188 (URN)
Conference
Swedish Symposium on Deep Learning (SSDL)
Projects
q2b
Funder
Swedish Research Council, 2012-5743; Riksbankens Jubileumsfond, NHS14-2068:1
Available from: 2017-12-12 Created: 2017-12-12 Last updated: 2018-01-13
Wilkinson, T., Lindström, J. & Brun, A. (2017). Neural Ctrl-F: Segmentation-free query-by-string word spotting in handwritten manuscript collections. In: 2017 IEEE International Conference on Computer Vision (ICCV). Paper presented at 16th IEEE International Conference on Computer Vision (ICCV), Venice, Italy, October 22-29, 2017 (pp. 4443-4452). IEEE
2017 (English). In: 2017 IEEE International Conference on Computer Vision (ICCV), IEEE, 2017, p. 4443-4452. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper, we approach the problem of segmentation-free query-by-string word spotting for handwritten documents. In other words, we use methods inspired by computer vision and machine learning to search for words in large collections of digitized manuscripts. In particular, we are interested in historical handwritten texts, which are often far more challenging than modern printed documents. This task is important, as it provides people with a way to quickly find what they are looking for in large collections that are tedious and difficult to read manually. To this end, we introduce an end-to-end trainable model based on deep neural networks that we call Ctrl-F-Net. Given a full manuscript page, the model simultaneously generates region proposals and embeds these into a distributed word embedding space, where searches are performed. We evaluate the model on common benchmarks for handwritten word spotting, outperforming the previous state-of-the-art segmentation-free approaches by a large margin, and in some cases even segmentation-based approaches. One interesting real-life application of our approach is to help historians find and count specific words in court records that are related to women's sustenance activities and division of labor. We provide promising preliminary experiments that validate our method on this task.
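
The search step described here, where a query string and the proposed regions meet in a shared embedding space, can be sketched as a nearest-neighbour ranking. This is only an illustration under stated assumptions, not Ctrl-F-Net: the function `embed_string` is a hypothetical hand-written character-count embedding, the region identifiers are invented, and the region vectors are produced from known transcriptions, whereas in the paper they would come from the network applied to image regions.

```python
import math
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def embed_string(word):
    """Hypothetical stand-in embedding: character counts for the whole word
    and for each half, concatenated and L2-normalised."""
    word = word.lower()
    half = (len(word) + 1) // 2
    vec = []
    for part in (word, word[:half], word[half:]):
        counts = Counter(part)
        vec.extend(float(counts[ch]) for ch in ALPHABET)
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def search(query, region_embeddings, top_k=3):
    """Rank regions by cosine similarity to the embedded query string."""
    q = embed_string(query)
    scored = [
        (sum(a * b for a, b in zip(q, emb)), region_id)
        for region_id, emb in region_embeddings.items()
    ]
    scored.sort(reverse=True)
    return [(region_id, score) for score, region_id in scored[:top_k]]


# Invented region ids; the vectors simulate what a trained model would output.
regions = {
    "page1/region0": embed_string("court"),
    "page1/region1": embed_string("woman"),
    "page1/region2": embed_string("women"),
    "page2/region0": embed_string("labour"),
}
results = search("women", regions)
print(results)
```

Because all vectors are unit length, the dot product is the cosine similarity, so an exact match scores close to 1 and near-spellings such as "woman" rank next.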

Place, publisher, year, edition, pages
IEEE, 2017
Series
IEEE International Conference on Computer Vision, E-ISSN 1550-5499
Keywords
Segmentation-free Word Spotting, Deep Learning, Convolutional Neural Network, Query-by-String
National Category
Computer Vision and Robotics (Autonomous Systems)
Research subject
Computerized Image Processing
Identifiers
urn:nbn:se:uu:diva-335926 (URN); 10.1109/ICCV.2017.475 (DOI); 000425498404054 (); 978-1-5386-1032-9 (ISBN)
Conference
16th IEEE International Conference on Computer Vision (ICCV), Venice, Italy, October 22-29, 2017
Projects
q2b
Funder
Swedish Research Council, 2012-5743; Riksbankens Jubileumsfond, NHS14-2068:1
Available from: 2017-12-11 Created: 2017-12-11 Last updated: 2019-04-08. Bibliographically approved
Svensson, L., Svensson, S., Nyström, I., Nysjö, F., Nysjö, J., Laloeuf, A., . . . Sintorn, I.-M. (2017). ProViz: a tool for explorative 3-D visualization and template matching in electron tomograms. COMPUTER METHODS IN BIOMECHANICS AND BIOMEDICAL ENGINEERING-IMAGING AND VISUALIZATION, 5(6), 446-454
2017 (English). In: COMPUTER METHODS IN BIOMECHANICS AND BIOMEDICAL ENGINEERING-IMAGING AND VISUALIZATION, ISSN 2168-1163, Vol. 5, no 6, p. 446-454. Article in journal (Refereed) Published
Abstract [en]

Visual understanding is a key aspect when studying electron tomography data-sets, alongside quantitative assessments such as registration of high-resolution structures. We here present the free software tool ProViz (Protein Visualization) for visualisation and template matching in electron tomograms of biological samples. The ProViz software contains methods and tools which we have developed, adapted and computationally optimised for easy and intuitive visualisation and analysis of electron tomograms with low signal-to-noise ratio. ProViz complements existing software in the application field and serves as an easy and convenient tool for a first assessment and screening of the tomograms. It provides enhancements in three areas: (1) improved visualisation that makes connections as well as intensity differences between and within objects or structures easier to see and interpret, (2) interactive transfer function editing with direct visual result feedback using both piecewise linear functions and Gaussian function elements, (3) computationally optimised template matching and tools to visually assess and interactively explore the correlation results. The visualisation capabilities and features of ProViz are demonstrated on various biological volume data-sets: bacterial filament structures in vitro, a desmosome and the transmembrane cadherin connections therein in situ, and liposomes filled with doxorubicin in solution. The explorative template matching is demonstrated on a synthetic IgG data-set.
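
Template matching of the kind mentioned in point (3) is commonly scored with normalised cross-correlation. The sketch below shows that scoring on a tiny 2-D grid; it is an assumption-laden illustration and not ProViz code: ProViz works on 3-D tomograms with an optimised implementation, while the function names `ncc_map` and `best_match` and the sample data here are invented.

```python
import math


def ncc_map(image, template):
    """Normalised cross-correlation of a 2-D template at every valid offset."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_dev = [v - t_mean for v in t_vals]
    t_norm = math.sqrt(sum(d * d for d in t_dev))
    scores = []
    for y in range(ih - th + 1):
        row_scores = []
        for x in range(iw - tw + 1):
            w_vals = [image[y + dy][x + dx] for dy in range(th) for dx in range(tw)]
            w_mean = sum(w_vals) / len(w_vals)
            w_dev = [v - w_mean for v in w_vals]
            w_norm = math.sqrt(sum(d * d for d in w_dev))
            denom = t_norm * w_norm
            # Flat windows have zero variance; score them as 0 to avoid dividing by zero.
            score = sum(a * b for a, b in zip(w_dev, t_dev)) / denom if denom > 0 else 0.0
            row_scores.append(score)
        scores.append(row_scores)
    return scores


def best_match(scores):
    """Return the (row, column) offset with the highest correlation score."""
    return max((s, (y, x)) for y, row in enumerate(scores) for x, s in enumerate(row))[1]


template = [[1, 5],
            [5, 1]]
image = [[0, 0, 0, 0, 0],
         [0, 0, 1, 5, 0],
         [0, 0, 5, 1, 0],
         [0, 0, 0, 0, 0]]
scores = ncc_map(image, template)
location = best_match(scores)
print(location)
```

Subtracting the mean and dividing by the norms makes the score insensitive to brightness and contrast changes, which is one reason this measure is often preferred for noisy data.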

Keywords
Electron tomography, direct volume rendering, image registration, connected component filtering, visualisation and analysis software
National Category
Bioinformatics (Computational Biology)
Identifiers
urn:nbn:se:uu:diva-359635 (URN); 10.1080/21681163.2016.1154483 (DOI); 000428130400009 ()
Funder
Swedish Foundation for Strategic Research; VINNOVA
Available from: 2018-09-05 Created: 2018-09-05 Last updated: 2018-09-05. Bibliographically approved
Ayyalasomayajula, K. R. & Brun, A. (2017). Semantic Labeling using Convolutional Networks coupled with Graph-Cuts for Document binarization. Paper presented at First Swedish Symposium on Deep Learning (SSDL 2017).
2017 (English). Conference paper, Poster (with or without abstract) (Other academic)
National Category
Computer Vision and Robotics (Autonomous Systems)
Research subject
Computerized Image Processing
Identifiers
urn:nbn:se:uu:diva-336174 (URN)
Conference
First Swedish Symposium on Deep Learning (SSDL 2017)
Funder
Swedish Research Council, 2012-5743; Riksbankens Jubileumsfond, NHS14-2068:1
Available from: 2017-12-12 Created: 2017-12-12 Last updated: 2018-01-13. Bibliographically approved
Wahlberg, F., Mårtensson, L. & Brun, A. (2016). Estimating manuscript production dates using both image and language data. In: Proceedings of SSBA, 2016. Paper presented at SSBA Symposium 2016.
2016 (English). In: Proceedings of SSBA, 2016, 2016. Conference paper, Published paper (Other academic)
National Category
Computer Vision and Robotics (Autonomous Systems)
Research subject
Computerized Image Processing
Identifiers
urn:nbn:se:uu:diva-301163 (URN)
Conference
SSBA Symposium 2016
Projects
q2b; q2b_vr2012
Funder
Swedish Research Council, 2012-5743
Available from: 2016-08-18 Created: 2016-08-18 Last updated: 2018-01-10
Ayyalasomayajula, K. R., Nettelblad, C. & Brun, A. (2016). Feature evaluation for handwritten character recognition with regressive and generative Hidden Markov Models. In: Advances in Visual Computing: Part I. Paper presented at ISVC 2016, December 12–14, Las Vegas, NV (pp. 278-287). Springer
2016 (English). In: Advances in Visual Computing: Part I, Springer, 2016, p. 278-287. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
Springer, 2016
Series
Lecture Notes in Computer Science ; 10072
National Category
Computer Vision and Robotics (Autonomous Systems)
Research subject
Computerized Image Processing
Identifiers
urn:nbn:se:uu:diva-308662 (URN); 10.1007/978-3-319-50835-1_26 (DOI); 978-3-319-50834-4 (ISBN)
Conference
ISVC 2016, December 12–14, Las Vegas, NV
Projects
q2b – From Quill to Bytes
Available from: 2016-12-10 Created: 2016-11-29 Last updated: 2019-03-19. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0002-4405-6888
