uu.se Uppsala University Publications
CalligraphyNet: Augmenting handwriting generation with quill based stroke width
Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Avdelningen för visuell information och interaktion. (Bildanalys och människa-datorinteraktion, Computerized Image Analysis and Human-Computer Interaction)
Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Avdelningen för visuell information och interaktion. Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Bildanalys och människa-datorinteraktion.
Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Bildanalys och människa-datorinteraktion. Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Avdelningen för visuell information och interaktion. Uppsala universitet, Medicinska och farmaceutiska vetenskapsområdet, Medicinska fakulteten, Institutionen för kirurgiska vetenskaper, Radiologi.
Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Bildanalys och människa-datorinteraktion. Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi, Avdelningen för visuell information och interaktion. ORCID iD: 0000-0002-4405-6888
2019 (English) Manuscript (preprint) (Other academic)
Abstract [en]

Realistic handwritten document generation garners a lot of interest from the document research community for its ability to generate annotated data. In the current approach we have used GAN-based stroke width enrichment and style transfer based refinement over generated data which result in realistic looking handwritten document images. The GAN part of data augmentation transfers the stroke variation introduced by a writing instrument onto images rendered from trajectories created by tracking coordinates along the stylus movement. The coordinates from stylus movement are augmented with the learned stroke width variations during the data augmentation block. An RNN model is then trained to learn the variation along the movement of the stylus along with the stroke variations corresponding to an input sequence of characters. This model is then used to generate images of words or sentences given an input character string. A document image thus created is used as a mask to transfer the style variations of the ink and the parchment. The generated image can capture the color content of the ink and parchment useful for creating annotated data.
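
As a rough illustration of what "augmenting stylus coordinates with stroke width" could look like as a data structure, the sketch below appends a width value to each (x, y) sample. The abstract states that the widths are learned by a GAN; the speed-based rule used here is only a placeholder assumption so that the example runs, and the function name and parameters are hypothetical.

```python
import math

def augment_with_width(points, min_width=1.0, max_width=4.0):
    """Append a stroke width to each (x, y) stylus sample.

    Placeholder heuristic (NOT the learned GAN model from the abstract):
    slower pen movement yields a wider stroke.
    """
    if not points:
        return []
    # Distance travelled to reach each sample; the first sample has none.
    steps = [0.0] + [math.dist(a, b) for a, b in zip(points, points[1:])]
    longest = max(steps) or 1.0  # avoid division by zero for a static pen
    augmented = []
    for (x, y), step in zip(points, steps):
        speed = step / longest  # 0 = slowest, 1 = fastest
        width = max_width - speed * (max_width - min_width)
        augmented.append((x, y, width))
    return augmented

trajectory = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
for sample in augment_with_width(trajectory):
    print(sample)
```

Each output tuple carries the original coordinates plus a width channel, which is the kind of sequence a recurrent model could then be trained on.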

Place, publisher, year, edition, pages
2019.
National subject category
Research subject
Computerized Image Processing
Identifiers
URN: urn:nbn:se:uu:diva-379633 OAI: oai:DiVA.org:uu-379633 DiVA, id: diva2:1297041
Conference
26th IEEE International Conference on Image Processing
Note

Currently under review

Available from: 2019-03-19 Created: 2019-03-19 Last updated: 2019-04-08
In thesis
1. Learning based segmentation and generation methods for handwritten document images
2019 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Computerized analysis of handwritten documents is an active research area in image analysis and computer vision. The goal is to create tools that can be made available to university libraries and to researchers in the humanities. Working with large collections of handwritten documents is very time consuming, and many old books and letters remain unread for centuries. Efficient computerized methods could help researchers in history, philology and computational linguistics to cost-effectively conduct a whole new type of research based on large collections of documents. The thesis contributes to this area through the development of methods based on machine learning. The passage of time degrades historical documents. Humidity, stains, heat, mold and natural aging of the materials over hundreds of years make the documents increasingly difficult to interpret. The first half of the dissertation therefore focuses on cleaning the visual information in these documents with image segmentation methods based on energy minimization and machine learning. However, machine learning algorithms learn by imitating what is expected of them, so a prerequisite for these methods to work is that ground truth is available. This is a problem for historical documents because there is a shortage of experts who can help to interpret them. The second part of the thesis is therefore about automatically creating synthetic documents that resemble handwritten historical documents. Because they are generated from a known text, their ground truth is given. The visual content of the generated documents includes variation in writing style and also imitates degradation factors to make the images realistic. When machine learning models are trained on synthetic images of handwritten text with known ground truth, in many cases they can give even better results on real historical documents.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2019. p. 97
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 1783
Keywords
Machine learning, handwriting, handwritten document analysis, deep learning, image processing
National subject category
Research subject
Computerized Image Processing
Identifiers
urn:nbn:se:uu:diva-379636 (URN) 978-91-513-0599-8 (ISBN)
Public defence
2019-05-08, TLS, Carolina Rediviva Library, Dag Hammarskjölds Väg 1, Uppsala, 09:00 (English)
Opponent
Supervisors
Available from: 2019-04-15 Created: 2019-03-19 Last updated: 2019-06-17 Bibliographically approved
2. Learning based Word Search and Visualisation for Historical Manuscript Images
2019 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Today, work with historical manuscripts is nearly exclusively done manually, by researchers in the humanities as well as laypeople mapping out their personal genealogy. This is a highly time-consuming endeavour, as it is not uncommon to spend months with the same volume of a few hundred pages. The last few decades have seen an ongoing effort to digitise manuscripts, both for preservation purposes and to increase accessibility. This has the added effect of enabling the use of methods and algorithms from Image Analysis and Machine Learning that have great potential both in making existing work more efficient and in creating new methodologies for manuscript-based research.

The first part of this thesis focuses on Word Spotting, the task of searching for a given text query in a manuscript collection. This can be broken down into two tasks, detecting where the words are located on the page, and then ranking the words according to their similarity to a search query. We propose Deep Learning models to do both, separately and then simultaneously, and successfully search through a large manuscript collection consisting of over a hundred thousand pages.
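
The ranking half of word spotting can be sketched as follows, assuming each detected word region and the query have already been mapped to fixed-length descriptor vectors. The abstract does not name a similarity measure; cosine similarity is used here purely as an assumption, and the region identifiers and function names are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_detections(query_vec, detections):
    """Sort detected word regions by similarity to the query, best first.

    detections: list of (region_id, descriptor) pairs, where region_id is a
    hypothetical handle for a detected word box and descriptor is its vector.
    """
    return sorted(
        detections,
        key=lambda d: cosine_similarity(query_vec, d[1]),
        reverse=True,
    )

detections = [
    ("page1-box3", [0.1, 0.9]),
    ("page2-box7", [0.8, 0.2]),
    ("page5-box1", [1.0, 0.0]),
]
ranked = rank_detections([1.0, 0.0], detections)
print([region for region, _ in ranked])
```

The detection step, which produces the regions in the first place, is outside the scope of this sketch.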

A limiting factor in applying learning-based methods to historical manuscript images is the cost, and therefore, lack of annotated data needed to train machine learning models. We propose several ways to mitigate this problem, including generating synthetic data, augmenting existing data to get better value from it, and learning from pre-existing, partially annotated data that was previously unusable.

In the second part, a method for visualising manuscript collections called the Image-based Word Cloud is proposed. Much like its text-based counterpart, it arranges the most representative words in a collection into a cloud, where the size of each word is proportional to its frequency of occurrence. This grants a user a single-image overview of a manuscript collection, regardless of its size. We further propose a way to estimate a manuscript's production date. This can grant historians context that is crucial for correctly interpreting the contents of a manuscript.
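
The sizing rule described above, word size proportional to frequency of occurrence, can be sketched in a few lines. This assumes the word images have already been grouped into labelled clusters, which is the hard part and is not shown; the function name, the `top_k` cut-off and the `max_size` scale are hypothetical.

```python
from collections import Counter

def word_sizes(words, top_k=3, max_size=100.0):
    """Map the top_k most frequent words to sizes proportional to their counts.

    The most frequent word gets max_size; the others scale linearly with it.
    """
    counts = Counter(words).most_common(top_k)
    if not counts:
        return {}
    top_count = counts[0][1]
    return {word: max_size * count / top_count for word, count in counts}

occurrences = ["anno", "anno", "anno", "anno", "domini", "domini", "rex"]
print(word_sizes(occurrences))
```

Placing the scaled words into a cloud layout is a separate step that this sketch leaves out.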

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2019. p. 82
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 1798
Keywords
Word Spotting, Convolutional Neural Networks, Deep Learning, Region Proposals, Historical Manuscripts, Computer Vision, Image Analysis, Visualisation, Document Analysis
National subject category
Research subject
Computerized Image Processing
Identifiers
urn:nbn:se:uu:diva-381308 (URN) 978-91-513-0633-9 (ISBN)
Public defence
2019-06-04, TLS (Tidskriftläsesalen), Carolina Rediviva, Dag Hammarskjölds väg 1, Uppsala, 10:15 (English)
Opponent
Supervisors
Funding agency
Swedish Research Council, 2012-5743; Riksbankens Jubileumsfond, NHS14-2068:1
Available from: 2019-05-13 Created: 2019-04-08 Last updated: 2019-06-18

Open Access in DiVA

No full text in DiVA

Authority records BETA

Wilkinson, Tomas; Malmberg, Filip; Brun, Anders

Search in DiVA

By author/editor
Ayyalasomayajula, Kalyan Ram; Wilkinson, Tomas; Malmberg, Filip; Brun, Anders