Hello,
This sounds like an interesting project.
I am assuming your project is a web app. If so, I propose building a single-page application on the MEAN stack. It would use RecordRTC, a JavaScript library that records the microphone in the browser, and stream the audio in chunks using a ScriptProcessorNode. The audio can be streamed to the server of your choice. I can set up the server if you wish to host it yourself, or I could set it up on Google Compute Engine or Azure if you are interested in scalability.
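To give a rough idea of the chunking step mentioned above, here is a minimal sketch. It assumes the browser delivers Float32 samples in the range [-1, 1] and that the server wants 16-bit PCM. The function names and the chunk size are my own placeholders, not part of RecordRTC, and the final design may differ.

```javascript
// Hypothetical helpers for illustration only; not RecordRTC's API.

// Convert Float32 samples in [-1, 1] to 16-bit signed PCM.
function floatTo16BitPCM(samples) {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range samples do not wrap around.
    const s = Math.max(-1, Math.min(1, samples[i]));
    pcm[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return pcm;
}

// Split a PCM buffer into fixed-size chunks; the last one may be shorter.
function chunkPCM(pcm, chunkSize) {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const chunks = [];
  for (let start = 0; start < pcm.length; start += chunkSize) {
    // subarray creates a view on the same memory, so nothing is copied.
    chunks.push(pcm.subarray(start, start + chunkSize));
  }
  return chunks;
}
```

In the app, the samples would come from the audio processing callback, and each chunk would be sent to the server as it is produced, for example over a WebSocket.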
I am familiar with this sort of application. For my Bachelor's degree project, I built an application that captured sound from the microphone and streamed it to a Google Compute Engine server, where it was converted to text via Google's API. The text was then processed with NLTK to prepare it for translation, and after translation, speech was synthesized with pyttsx and sent back to the client.
Regarding the milestones:
Basic Prototype - Simple app that streams to localhost
Server Setup - Setting up the server for your app
Beta App - Main functionality achieved (streaming to server), some bugs, not very fancy looking
Finished Product - Polished app, fully functional and optimized
I would like to discuss the project further and hope for a collaboration.