Project details
Description
ShopGun works by taking images of products and placing them in catalogues. This involves a lot of manual work, both in tagging and captioning product images and in placing products into abstract hierarchies. This PhD aims to automate these two steps through machine learning, and to enable multilingual generation of captions and thus of product catalogues.
This requires flexible, ontology-based models for multi-class, multi-label problems within each available data modality (textual, image and tabular). Text generation from images is a burgeoning area; using images to create hierarchies, and to generate captions cross-lingually, is both novel and of commercial value.
The first key question that the research will try to answer is how best to combine data of different modalities in order to increase the models' performance, measured via multiple metrics, both direct and indirect (impact on desired user behaviour, measured through events within the company's core products). In the future the available data will come from more and more modalities, and our models need to be able to combine them in beneficial ways.
The second research question concerns multi-task training, e.g. multiple classification and generation tasks. The advanced context here, containing multiple modalities and tasks, is again exciting and intuitively an efficient use of data.
The third and final question is about data preprocessing and labelling in the context of an ever-changing ontology of product categories. ML models need to work well on multi-class, multi-label problems with targets based on an ontology of categories, and over frequently updated data; this represents an ambitious challenge in a grounded, commercially valuable context.
Acronym | DLGeM |
---|---|
Status | Finished |
Effective start/end date | 19/08/2019 → 18/11/2022 |
Collaborative partners
- IT-Universitetet i København
- ShopGun (Project partner) (lead)
Funding
- Innovation Fund Denmark (IFD): 1.072.000,00 kr.