uu.se: Uppsala University Publications
A machine learning pipeline for predicting success rates in PrEST production
Uppsala University, Disciplinary Domain of Science and Technology, Biology, Biology Education Centre.
2019 (English). Independent thesis, advanced level (professional degree), 20 credits / 30 HE credits. Student thesis (degree project)
Abstract [en]

Protein epitope signature tags (PrESTs) are antigens produced in Escherichia coli at Atlas Antibodies and used to immunize rabbits for antibody production. This project uses machine learning models to predict success rates for production and immunization and to identify features important for success. The features are generated from the PrEST sequences using web servers, downloadable software and Python scripts. An additional analysis of the effect of rabbit and environmental features on immunization success is also performed. Many different models and model architectures, along with a few thousand features, were tried. The models reached a maximum F1 score of about 0.55 for a target outcome divided into two classes, in both the production and the immunization analysis. No features could be identified as important with significance.

The rabbit and environmental analysis showed that these types of features are more important for PrEST immunization success than the PrEST-related features. The F1 score rose to about 0.6, and the environmental features ranked higher based on information gain. More data are needed to draw definitive conclusions, but this indicates that Atlas Antibodies should in future focus on recording environmental features during production to improve the chances of predicting success rates.
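
The two metrics named in the abstract, the F1 score for a two-class outcome and feature ranking by information gain, can be illustrated with a minimal Python sketch. This is not the thesis pipeline: the feature names (`season`, `batch`), the toy labels and the helper functions are all hypothetical and only show how the quantities are defined.

```python
import math
from collections import Counter


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a categorical feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: 1 = successful immunization, 0 = failure.
success = [1, 1, 0, 0, 1, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 1]

# Hypothetical categorical features, invented for illustration only.
season = ["summer", "summer", "winter", "winter",
          "summer", "winter", "summer", "winter"]
batch = ["a", "b", "a", "b", "a", "b", "a", "b"]

f1 = f1_score(success, predicted)
gains = {
    "season": information_gain(season, success),
    "batch": information_gain(batch, success),
}
# Rank features from most to least informative.
ranking = sorted(gains, key=gains.get, reverse=True)
print(f"F1 = {f1:.3f}")
print("Ranking by information gain:", ranking)
```

In this toy data the invented `season` feature separates the two classes perfectly, so it ranks above `batch`; real rankings would depend entirely on the recorded data.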

Place, publisher, year, edition, pages
2019. p. 84
Series
UPTEC X ; 19009
Keywords [en]
bioinformatics, proteomics, machine learning
National Category
Bioinformatics (Computational Biology)
Identifiers
URN: urn:nbn:se:uu:diva-385494
OAI: oai:DiVA.org:uu-385494
DiVA, id: diva2:1324575
External cooperation
Atlas Antibodies
Educational program
Molecular Biotechnology Engineering Programme
Available from: 2019-06-14. Created: 2019-06-13. Last updated: 2019-06-14. Bibliographically approved.

Open Access in DiVA

No full text in DiVA
