Commit 9b08277c authored by Aviparna Biswas

Update README.md

parent d7bc9f12
# Doccano
# Labeling text using Doccano
Doccano is an open-source text annotation tool. It can be used to create labeled datasets for:
- Text classification
- Entity extraction
- Sequence to sequence translation
Doccano can be used to create labeled data for training the `EntityRecognizer` model in `arcgis.learn`.
Doccano was created by Hiroki Nakayama, Takahiro Kubo, Junya Kamura, Yasufumi Taniguchi, and Xu Liang.
## How to label training data for named entity recognition with doccano
1. After Doccano has been deployed to the local machine, go to the Doccano homepage and log in with your credentials.
2. Select the appropriate project type (sequence labeling for named entity recognition).
3. If data needs to be imported for annotation, go to 'Dataset' in the left panel, then click 'Actions' > 'Import dataset'.
4. Select 'JSONL', click 'Select file(s)', and point it to the reports file (`docanno_deployment\reports_label.jsonl`). **Alternatively, text documents can be uploaded using the 'Plain text' option.**
5. After the file has been imported, you will see the documents loaded on the screen.
6. Click on 'Start annotation' from the top menu bar.
7. Read through the document (use the bottom navigation bar to move between documents). Highlight a text span with your mouse and select the relevant label.
8. New labels can also be created by navigating to 'Labels' in the left panel.
9. Once all the documents have been labeled, go to 'Dataset' > 'Actions' > 'Export dataset'.
10. Select 'JSONL (Text-Labels)'.
11. Set an export file name.
12. Click Export.
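
The import and export files used in the steps above are plain JSON Lines, so they can be prepared and checked with a short script. The following is a minimal sketch, not part of Doccano or `arcgis.learn`. The helper names `write_import_file` and `read_export_file` are hypothetical. It assumes each import record carries its text in a `text` field, and that each exported record stores its spans as `[start, end, label]` triples under either a `labels` or a `label` key. The exact key can differ between Doccano versions, so inspect one line of your own export before relying on it.

```python
import json


def write_import_file(texts, path):
    """Write one JSON object per line, each holding a document in "text"."""
    with open(path, "w", encoding="utf-8") as f:
        for text in texts:
            f.write(json.dumps({"text": text}, ensure_ascii=False) + "\n")


def read_export_file(path):
    """Return a list of (text, [(span_text, label), ...]) pairs."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            doc = json.loads(line)
            # Assumption: spans are [start, end, label] triples stored
            # under "labels" or "label", depending on the Doccano version.
            spans = doc.get("labels") or doc.get("label") or []
            text = doc["text"]
            entities = [(text[start:end], label) for start, end, label in spans]
            records.append((text, entities))
    return records
```

Calling `read_export_file` on the exported file and printing a few records is a quick way to confirm that every highlighted span lines up with the intended label before the data is used for training.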