Multi-Cloud NLP Service

Join a team that will create a multi-cloud natural language processing (NLP) service.

Your goal will be to develop an API, a secure REST service, and a command line tool that interface easily with the natural language services of multiple cloud providers. The integrated service will use one or all of these providers to carry out an NLP analysis task.

This is especially useful for data scientists who want to access multiple cloud providers, either to avoid vendor lock-in or to use services that other providers do not offer.
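
A minimal sketch of what "use one or all of these providers" could look like in a Python API. The names `Provider`, `KeywordProvider`, `analyze`, and the provider names `alpha` and `beta` are hypothetical and not part of the project. The offline keyword scorer only stands in for real cloud clients so that the example runs without credentials.

```python
from abc import ABC, abstractmethod


class Provider(ABC):
    """Common interface that every cloud NLP backend would implement (assumed design)."""

    name: str

    @abstractmethod
    def sentiment(self, text: str) -> float:
        """Return a sentiment score in the range [-1.0, 1.0]."""


class KeywordProvider(Provider):
    """Offline stand-in for a real cloud client (hypothetical, for illustration only)."""

    def __init__(self, name, positive, negative):
        self.name = name
        self.positive = set(positive)
        self.negative = set(negative)

    def sentiment(self, text):
        words = text.lower().split()
        hits = sum(w in self.positive for w in words) - sum(w in self.negative for w in words)
        # Normalize by text length and clamp to the documented range.
        return max(-1.0, min(1.0, hits / max(len(words), 1)))


def analyze(text, providers, use="all"):
    """Query one named provider or all of them; return {provider name: score}."""
    selected = providers if use == "all" else [p for p in providers if p.name == use]
    if not selected:
        raise ValueError(f"unknown provider: {use}")
    return {p.name: p.sentiment(text) for p in selected}


providers = [
    KeywordProvider("alpha", positive=["good", "great"], negative=["bad"]),
    KeywordProvider("beta", positive=["great"], negative=["bad", "poor"]),
]
results = analyze("great service but poor docs", providers)
print(results)  # {'alpha': 0.2, 'beta': 0.0}
```

Keeping every backend behind one interface means the command line tool and the REST service can share the same `analyze` call, and swapping a provider does not change user code.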

Deliverables

You will be developing

  1. A comparison of NLP cloud services.

    Distinguish them by their characteristics and summarize the comparison in a table.

  2. A command line interface to the service

  3. A REST API that calls out to the other services

  4. A Python API that wraps several services into a convenient library

  5. A manual describing the functionality

  6. A container that holds everything so it can be run on Linux, macOS, and Windows.

  7. A convenient command line tool that starts the service and interacts with it, so that the service is really easy to use. The command line will hide the Docker commands behind human-readable abbreviations.

  8. Unit tests written with pytest

  9. A high-quality report including benchmarks

  10. Authentication to the cloud providers and to the REST service

  11. A YAML configuration for the service

  12. Development in a container using Ubuntu 20.04. We will create a Dockerfile.

  13. The code will be developed in GitHub at cloudmesh-nlp, which will be set up by Gregor
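
The idea of a command line tool that hides the Docker commands behind short abbreviations can be sketched as follows. This is only an illustration under stated assumptions: the image and container name `cloudmesh-nlp`, the port `8000`, the action names, and the `--dryrun` flag are all hypothetical. The `docker build`, `docker run`, `docker stop`, and `docker logs` subcommands and the flags used with them are standard Docker CLI.

```python
import argparse
import shlex
import subprocess

IMAGE = "cloudmesh-nlp"      # hypothetical image name
CONTAINER = "cloudmesh-nlp"  # hypothetical container name
PORT = 8000                  # hypothetical port of the REST service


def docker_command(action):
    """Translate a short, human-readable action into a docker argument list."""
    commands = {
        "build": ["docker", "build", "-t", IMAGE, "."],
        "start": ["docker", "run", "-d", "--rm", "--name", CONTAINER,
                  "-p", f"{PORT}:{PORT}", IMAGE],
        "stop": ["docker", "stop", CONTAINER],
        "logs": ["docker", "logs", CONTAINER],
    }
    return commands[action]


def main(argv=None):
    parser = argparse.ArgumentParser(prog="nlp")
    parser.add_argument("action", choices=["build", "start", "stop", "logs"])
    parser.add_argument("--dryrun", action="store_true",
                        help="print the docker command instead of running it")
    args = parser.parse_args(argv)
    command = docker_command(args.action)
    if args.dryrun:
        print(shlex.join(command))
    else:
        subprocess.run(command, check=True)
    return command


# Dry run, so no Docker installation is needed to try the sketch.
main(["start", "--dryrun"])
# prints: docker run -d --rm --name cloudmesh-nlp -p 8000:8000 cloudmesh-nlp
```

A dry-run option of this kind also makes the wrapper easy to cover with pytest, since the tests can compare the generated command without starting a container.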

Requirements

To participate in this project, you will need to:

  1. Have a computer on which you can run Docker (Windows Home will not work)
  2. Have significant Python knowledge
  3. Be highly motivated
  4. Be willing to meet about this project once or twice a week
  5. Show significant progress over the lifetime of the project
  6. Be knowledgeable about GitHub (a repository will be provided to which Dr. von Laszewski will contribute)
  7. Conduct task management in GitHub (Gregor will explain)
  8. Be honest and not hide problems or implementation bugs
  9. Be able to join a videoconference and share your screen (I typically use Google Meet or Zoom)

Getting started

  1. Start with a survey.
  2. Design the command line interface first, as that may be the easiest part and will showcase how to design the API.
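
One way to see how a command line design can showcase the API is to write the argument parser before any provider code exists. The sketch below uses `argparse` from the Python standard library. The subcommands `analyze` and `providers`, the options `--task` and `--provider`, and the provider name `aws` are assumptions made for illustration, not a prescribed interface.

```python
import argparse


def build_parser():
    """Build a parser whose subcommands mirror the planned API functions."""
    parser = argparse.ArgumentParser(prog="nlp", description="Multi-cloud NLP client (sketch)")
    sub = parser.add_subparsers(dest="command", required=True)

    analyze = sub.add_parser("analyze", help="run an NLP task on a text")
    analyze.add_argument("text")
    analyze.add_argument("--task", default="sentiment", choices=["sentiment", "entities"])
    analyze.add_argument("--provider", default="all", help="provider name, or 'all'")

    sub.add_parser("providers", help="list the configured providers")
    return parser


args = build_parser().parse_args(["analyze", "I like this", "--provider", "aws"])
# Each subcommand and option then maps onto one API call, for example
# analyze(text=args.text, task=args.task, provider=args.provider).
print(args.command, args.task, args.provider)  # analyze sentiment aws
```

Once the subcommands and options are agreed on, the same names can be reused for the Python functions and the REST routes.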

In case of questions, please contact Gregor at

laszewski@gmail.com

Gregor von Laszewski
Research Professor

