Multi Cloud NLP Service
Be part of a team to create a multicloud natural language processing service.
Your goal will be to develop a Python API, a secure REST service, and a command line tool that easily interface with the natural language services of multiple cloud providers. Your integrated service will use one or all of them to carry out a task related to NLP analysis.
This is especially useful for data scientists who want to access multiple cloud providers, avoid vendor lock-in, or use services that other providers do not offer.
You will be developing:
A comparison of NLP cloud services, for example:
- Text Summarization
- Twinword Text Analysis
- IBM Watson Alchemy
- Geneea Interpretor NLP
- MLP Cloud
- Natural Language AI https://cloud.google.com/natural-language
- Cloud Factory NLP https://www.cloudfactory.com/services/nlp
- There could be many more
Distinguish them by their characteristics and summarize the comparison in a table.
A command line interface to the service
A REST API that calls out to the other services
A Python API that wraps several services into a convenient library
A manual describing the functionality
Put everything in a container so it can be run on Linux, Mac and Windows.
Create a convenient command line tool that allows starting the service and interacting with it, and that makes both really easy to do. The command line tool will hide the Docker commands while providing human-readable abbreviations.
Deliver unit tests with pytest
Deliver a high-quality report including benchmarks
Integrate authentication to the cloud providers and to the REST service.
Use YAML for the configuration of the service
Do the development in a container using Ubuntu 20.04. We will create a Dockerfile
The code will be developed on GitHub in the cloudmesh-nlp repository, which will be set up by Gregor
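To make the "Python API that wraps several services" deliverable more concrete, here is a minimal sketch of one possible shape for such a library. Everything in it is an assumption for illustration: the names `Provider`, `KeywordProvider`, `MultiCloudNLP`, and `SentimentResult` are hypothetical, and the offline keyword provider stands in for real cloud calls, which would need each vendor's SDK and credentials.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class SentimentResult:
    provider: str   # name of the provider that produced the score
    score: float    # hypothetical score; here the share of positive words


class Provider(ABC):
    """Common interface every wrapped cloud service would implement."""

    name: str

    @abstractmethod
    def sentiment(self, text: str) -> SentimentResult:
        ...


class KeywordProvider(Provider):
    """Offline stand-in for a cloud provider (no network access)."""

    def __init__(self, name, positive_words):
        self.name = name
        self.positive_words = set(positive_words)

    def sentiment(self, text):
        words = [w.strip(".,!?") for w in text.lower().split()]
        if not words:
            return SentimentResult(self.name, 0.0)
        hits = sum(w in self.positive_words for w in words)
        return SentimentResult(self.name, hits / len(words))


class MultiCloudNLP:
    """Dispatches a request to one provider or to all of them."""

    def __init__(self, providers):
        self.providers = {p.name: p for p in providers}

    def sentiment(self, text, provider=None):
        # provider=None means "ask every configured provider"
        if provider is not None:
            selected = [self.providers[provider]]
        else:
            selected = list(self.providers.values())
        return [p.sentiment(text) for p in selected]
```

With this layout, adding a real cloud service would mean writing one more `Provider` subclass, while the REST service and the command line tool could both call `MultiCloudNLP` without knowing which vendor answers.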
To participate in this project, you will need to meet the following requirements:
- A computer on which you can run Docker (Windows Home will not work)
- Significant Python knowledge
- Be highly motivated
- Be willing to have meetings on this project once or twice a week
- Showcase significant progress over the lifetime of the project.
- Be knowledgeable about GitHub (a repository will be provided to which Dr. von Laszewski will contribute)
- Conduct task management in GitHub (Gregor will explain)
- Be honest and do not hide problems or implementation bugs.
- You must be able to join a videoconference and share your screen (I typically use Google Meet or Zoom).
- Start with a survey
- Design the command line interface first, as that may be the easiest part and will showcase how to design the API
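As a starting point for the command line design, the sketch below shows one way short, human-readable commands could be mapped onto the underlying Docker commands. It is only an illustration under stated assumptions: the program name `nlp`, the image and container name `cloudmesh-nlp`, port 8000, and the `--dry-run` flag are all hypothetical choices, not decisions made by the project.

```python
import argparse
import shlex
import subprocess

IMAGE = "cloudmesh-nlp"      # hypothetical image name
CONTAINER = "cloudmesh-nlp"  # hypothetical container name
PORT = 8000                  # hypothetical port of the REST service


def docker_command(action):
    """Translate a short action name into the full docker command line."""
    commands = {
        "build": ["docker", "build", "-t", IMAGE, "."],
        "start": ["docker", "run", "-d", "--rm", "--name", CONTAINER,
                  "-p", f"{PORT}:{PORT}", IMAGE],
        "stop": ["docker", "stop", CONTAINER],
        "logs": ["docker", "logs", CONTAINER],
    }
    return commands[action]


def build_parser():
    parser = argparse.ArgumentParser(
        prog="nlp",
        description="Start, stop, and inspect the NLP service container.",
    )
    parser.add_argument("action", choices=["build", "start", "stop", "logs"])
    parser.add_argument("--dry-run", action="store_true",
                        help="print the docker command instead of running it")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    command = docker_command(args.action)
    if args.dry_run:
        # Show what would be executed without touching Docker.
        print(shlex.join(command))
        return 0
    return subprocess.run(command).returncode
```

Calling `main(["start", "--dry-run"])` prints the `docker run` command without executing it, which makes the mapping easy to unit test with pytest before any container exists. A real tool would add an entry point that calls `main()` and further subcommands for interacting with the running service.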
In case of questions, please contact Gregor at