uu.se: Uppsala University publications
Learning based segmentation and generation methods for handwritten document images
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Visual Information and Interaction. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computerized Image Analysis and Human-Computer Interaction. (Computerized Image Analysis and Human-Computer Interaction)
2019 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Computerized analysis of handwritten documents is an active research area in image analysis and computer vision. The goal is to create tools that can be made available at university libraries and to researchers in the humanities. Working with large collections of handwritten documents is very time-consuming, and many old books and letters remain unread for centuries. Efficient computerized methods could help researchers in history, philology and computational linguistics to cost-effectively conduct a whole new type of research based on large collections of documents. The thesis contributes to this area by developing methods based on machine learning. The passage of time degrades historical documents: humidity, stains, heat, mold and natural aging of the materials over hundreds of years make the documents increasingly difficult to interpret. The first half of the dissertation therefore focuses on cleaning the visual information in these documents with image segmentation methods based on energy minimization and machine learning. However, machine learning algorithms learn by imitating what is expected of them, so one prerequisite for these methods to work is that ground truth is available. This is a problem for historical documents because there is a shortage of experts who can help to interpret them. The second part of the thesis is therefore about automatically creating synthetic documents that resemble handwritten historical documents. Because they are generated from a known text, their ground truth is given. The visual content of the generated documents includes variation in writing style and also imitates degradation factors to make the images realistic. When machine learning models are trained on synthetic images of handwritten text with known ground truth, they can in many cases give even better results on real historical documents.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2019, p. 97
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 1783
Keywords [en]
Machine learning, handwriting, handwritten document analysis, deep learning, image processing
National subject category
Computer Systems
Research subject
Computerized Image Processing
Identifiers
URN: urn:nbn:se:uu:diva-379636
ISBN: 978-91-513-0599-8 (print)
OAI: oai:DiVA.org:uu-379636
DiVA, id: diva2:1297042
Public defence
2019-05-08, TLS, Carolina Rediviva Library, Dag Hammarskjölds Väg 1, Uppsala, 09:00 (English)
Opponent
Supervisors
Available from: 2019-04-15 Created: 2019-03-19 Last updated: 2019-06-17 Bibliographically approved
List of papers
1. Document binarization using topological clustering guided Laplacian Energy Segmentation
2014 (English). In: Proceedings International Conference on Frontiers in Handwriting Recognition (ICFHR), 2014, 2014, p. 523-528. Conference paper, Published paper (Refereed)
Abstract [en]

The current approach for text binarization proposes a clustering algorithm as a preprocessing stage to an energy-based segmentation method. It uses a clustering algorithm to obtain a coarse estimate of the background (BG) and foreground (FG) pixels. These estimates are used as a prior for the source and sink points of a graph cut implementation, which is used to efficiently find the minimum energy solution of an objective function to separate the BG and FG. The binary image thus obtained is used to refine the edge map that guides the graph cut algorithm. A final binary image is obtained by once again performing the graph cut guided by the refined edges on a Laplacian of the image.
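
The two-stage idea in this abstract (a coarse clustering estimate used as a prior for a graph cut) can be illustrated with a minimal sketch. It is not the paper's implementation: plain 2-means on pixel intensities stands in for the topological clustering, the edge-map refinement and the Laplacian step are left out, and the names `two_means`, `min_cut_source_side` and `binarize` as well as the `smooth` weight are assumptions made only for this example.

```python
from collections import defaultdict, deque

def two_means(values, iters=20):
    """1-D 2-means on intensities; returns (dark_mean, bright_mean)."""
    lo, hi = float(min(values)), float(max(values))
    for _ in range(iters):
        dark = [v for v in values if abs(v - lo) <= abs(v - hi)]
        bright = [v for v in values if abs(v - lo) > abs(v - hi)]
        if dark:
            lo = sum(dark) / len(dark)
        if bright:
            hi = sum(bright) / len(bright)
    return lo, hi

def min_cut_source_side(cap, s, t):
    """Edmonds-Karp max-flow on a defaultdict(dict) of residual capacities.
    Returns the nodes still reachable from s once no augmenting path is left."""
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return set(parent)
        # Find the bottleneck along the path, then push that much flow.
        bottleneck, v = float("inf"), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] = cap[v].get(u, 0.0) + bottleneck
            v = u

def binarize(img, smooth=0.1):
    """Label each pixel 1 (foreground/text) or 0 (background)."""
    h, w = len(img), len(img[0])
    dark, bright = two_means([v for row in img for v in row])
    scale = (bright - dark) ** 2 or 1.0
    cap = defaultdict(dict)
    src, sink = "src", "sink"
    for y in range(h):
        for x in range(w):
            p, v = (y, x), img[y][x]
            # Cutting src->p puts p on the sink (background) side, so this
            # edge carries the cost of calling p background, and vice versa.
            cap[src][p] = (v - bright) ** 2 / scale
            cap[p][sink] = (v - dark) ** 2 / scale
            for q in ((y + 1, x), (y, x + 1)):
                if q[0] < h and q[1] < w:
                    # Constant smoothness penalty between 4-neighbours.
                    cap[p][q] = cap[p].get(q, 0.0) + smooth
                    cap[q][p] = cap[q].get(p, 0.0) + smooth
    fg = min_cut_source_side(cap, src, sink)
    return [[1 if (y, x) in fg else 0 for x in range(w)] for y in range(h)]

# Toy image: a dark three-pixel stroke plus one isolated grey speckle.
img = [
    [200, 200, 200, 200, 200, 200, 200],
    [200,  40,  40,  40, 200, 200, 200],
    [200, 200, 200, 200, 200, 110, 200],
    [200, 200, 200, 200, 200, 200, 200],
]
mask = binarize(img)
for row in mask:
    print(row)
```

In this toy image the grey speckle lies closer to the dark cluster mean, so a per-pixel nearest-mean rule would mark it as text. The smoothness term in the cut is what lets the surrounding background outvote it while the connected stroke survives.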

Series
Frontiers in Handwriting Recognition, ISSN 2167-6445 ; 14
Keywords
Image Processing; Classification; Machine Learning; Graph-theoretic methods.
National subject category
Computer Systems; Signal Processing
Research subject
Computer Science
Identifiers
urn:nbn:se:uu:diva-238316 (URN)
10.1109/ICFHR.2014.94 (DOI)
978-1-4799-4335-7 (ISBN)
Conference
International Conference on Frontiers in Handwriting Recognition (ICFHR), September 1-4, 2014, Crete, Greece.
Funder
Swedish Research Council, 2012-5743
Available from: 2014-12-11 Created: 2014-12-11 Last updated: 2019-03-19 Bibliographically approved
2. Historical document binarization combining semantic labeling and graph cuts
2017 (English). In: Image Analysis: Part I, Springer, 2017, p. 386-396. Conference paper, Published paper (Refereed)
Abstract [en]

Most data mining applications on collections of historical documents require binarization of the digitized images as a pre-processing step. Historical documents are often subjected to degradations such as parchment aging, smudges and bleed-through from the other side. The text is sometimes printed, but more often handwritten. Mathematical modeling of the appearance of the text, the background and all kinds of degradations is challenging. In the current work we tackle binarization as a pixel classification problem. We first apply semantic segmentation, using fully convolutional neural networks. In order to improve the sharpness of the result, we then apply a graph cut algorithm. The labels from the semantic segmentation are used as approximate estimates of the text and background, with the probability map of the background used for pruning the edges in the graph cut. The results obtained show significant improvement over the state-of-the-art approach.
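
As a rough illustration of how a network's background probability map could feed a graph cut, the sketch below turns probabilities into per-pixel label costs and builds a pruned list of neighbour edges. The abstract does not give the exact rule, so everything here is an assumption: negative log-probabilities are a common choice of unary cost, and the pruning rule (drop an edge when the two endpoints' background probabilities differ by at least `jump`) is a hypothetical stand-in. The function names are illustrative as well.

```python
import math

def unary_costs(prob_bg, eps=1e-6):
    """Per-pixel (cost_background, cost_text) as negative log-probabilities."""
    costs = []
    for row in prob_bg:
        out = []
        for p in row:
            p = min(max(p, eps), 1.0 - eps)  # keep log() finite
            out.append((-math.log(p), -math.log(1.0 - p)))
        costs.append(out)
    return costs

def smoothness_edges(prob_bg, jump=0.5):
    """4-neighbour edges, keeping only pairs whose background probabilities
    are close. Hypothetical pruning rule, not taken from the paper."""
    h, w = len(prob_bg), len(prob_bg[0])
    edges = []
    for y in range(h):
        for x in range(w):
            for ny, nx in ((y + 1, x), (y, x + 1)):
                if ny < h and nx < w and abs(prob_bg[y][x] - prob_bg[ny][nx]) < jump:
                    edges.append(((y, x), (ny, nx)))
    return edges

# Toy probability map: two background columns next to one text column.
prob_bg = [
    [0.9, 0.9, 0.1],
    [0.9, 0.9, 0.1],
]
costs = unary_costs(prob_bg)
edges = smoothness_edges(prob_bg)
print(len(edges), "edges kept")
```

Dropping the edges that straddle a confident text/background boundary removes the smoothness penalty there, which is one way a cut can be encouraged to follow sharp stroke contours.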

Place, publisher, year, edition, pages
Springer, 2017
Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 10269
National subject category
Computer Vision and Robotics (Autonomous Systems)
Research subject
Computerized Image Processing
Identifiers
urn:nbn:se:uu:diva-335335 (URN)
10.1007/978-3-319-59126-1_32 (DOI)
000454359300032 ()
978-3-319-59125-4 (ISBN)
Conference
SCIA 2017, June 12–14, Tromsø, Norway
Funder
Swedish Research Council, 2012-5743
Riksbankens Jubileumsfond, NHS14-2068:1
Available from: 2017-05-19 Created: 2017-12-04 Last updated: 2019-03-19 Bibliographically approved
3. PDNet: Semantic segmentation integrated with a primal-dual network for document binarization
2019 (English). In: Pattern Recognition Letters, ISSN 0167-8655, E-ISSN 1872-7344, Vol. 121, p. 52-60. Article in journal (Refereed), Published
National subject category
Computer Vision and Robotics (Autonomous Systems)
Research subject
Computerized Image Processing
Identifiers
urn:nbn:se:uu:diva-366933 (URN)
10.1016/j.patrec.2018.05.011 (DOI)
000459876700008 ()
Funder
Swedish Research Council, 2012-5743
Riksbankens Jubileumsfond, NHS14-2068:1
Available from: 2018-05-16 Created: 2018-11-27 Last updated: 2019-04-04 Bibliographically approved
4. Feature evaluation for handwritten character recognition with regressive and generative Hidden Markov Models
2016 (English). In: Advances in Visual Computing: Part I, Springer, 2016, p. 278-287. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
Springer, 2016
Series
Lecture Notes in Computer Science ; 10072
National subject category
Computer Vision and Robotics (Autonomous Systems)
Research subject
Computerized Image Processing
Identifiers
urn:nbn:se:uu:diva-308662 (URN)
10.1007/978-3-319-50835-1_26 (DOI)
978-3-319-50834-4 (ISBN)
Conference
ISVC 2016, December 12–14, Las Vegas, NV
Project
q2b – From Quill to Bytes
Available from: 2016-12-10 Created: 2016-11-29 Last updated: 2019-03-19 Bibliographically approved
5. CalligraphyNet: Augmenting handwriting generation with quill based stroke width
2019 (English). Manuscript (preprint) (Other academic)
Abstract [en]

Realistic handwritten document generation garners a lot of interest from the document research community for its ability to generate annotated data. In the current approach we have used GAN-based stroke width enrichment and style transfer based refinement over generated data which result in realistic looking handwritten document images. The GAN part of data augmentation transfers the stroke variation introduced by a writing instrument onto images rendered from trajectories created by tracking coordinates along the stylus movement. The coordinates from stylus movement are augmented with the learned stroke width variations during the data augmentation block. An RNN model is then trained to learn the variation along the movement of the stylus along with the stroke variations corresponding to an input sequence of characters. This model is then used to generate images of words or sentences given an input character string. A document image thus created is used as a mask to transfer the style variations of the ink and the parchment. The generated image can capture the color content of the ink and parchment useful for creating annotated data.
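
The step of rendering an image from stylus coordinates that carry a stroke width can be sketched as follows. This covers only the rasterisation: the GAN, the RNN and the style transfer are not modelled, the width values are hand-picked stand-ins for learned ones, and `render_stroke` with its `steps` parameter is a hypothetical helper introduced for this example.

```python
def render_stroke(points, height, width, steps=8):
    """Rasterise a pen trajectory given as (x, y, stroke_width) samples.
    Position and width are interpolated linearly between samples and a
    filled disc is stamped at each interpolated position."""
    img = [[0] * width for _ in range(height)]
    for (x0, y0, w0), (x1, y1, w1) in zip(points, points[1:]):
        for i in range(steps + 1):
            t = i / steps
            cx, cy = x0 + t * (x1 - x0), y0 + t * (y1 - y0)
            r = (w0 + t * (w1 - w0)) / 2.0
            # Visit only the bounding box of the disc, clipped to the image.
            for y in range(max(0, int(cy - r)), min(height, int(cy + r) + 1)):
                for x in range(max(0, int(cx - r)), min(width, int(cx + r) + 1)):
                    if (x - cx) ** 2 + (y - cy) ** 2 <= r * r:
                        img[y][x] = 1
    return img

# One horizontal segment whose width grows from 1 to 5 pixels, loosely
# imitating a quill that is pressed harder towards the end of the stroke.
stroke = [(2, 5, 1.0), (12, 5, 5.0)]
canvas = render_stroke(stroke, height=11, width=15)
for row in canvas:
    print("".join("#" if v else "." for v in row))
```

A larger `steps` value stamps more discs per segment, which closes any gaps between stamps when samples are far apart relative to the stroke width.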

National subject category
Computer Systems
Research subject
Computerized Image Processing
Identifiers
urn:nbn:se:uu:diva-379633 (URN)
Conference
26th IEEE International Conference on Image Processing
Note

Currently under review

Available from: 2019-03-19 Created: 2019-03-19 Last updated: 2019-04-08

Open Access in DiVA

fulltext (2990 kB), 196 downloads
File information
File name: FULLTEXT01.pdf
File size: 2990 kB
Checksum (SHA-512):
9a65355f9cdb468e9cd5c7bff804c2c8a6cdaac5249d00aa8227954897cb5a92d58b41ab4ccc1005239f58c87f4a18f34f2926af7a95e7e3ed3c7aa2b4805915
Type: fulltext
Mimetype: application/pdf

Authority records BETA

Ayyalasomayajula, Kalyan Ram
