Coding for Machine Learning
Hi, I am Richa Sinha. I completed my bachelor's degree in Electronics and Communication Engineering at NIT Jamshedpur, India in 2012 and moved to the USA in 2016 to pursue a master's in Computer Engineering. Before my master's, I worked for a little over 3.5 years in India in the field of computer communication and networking.
During my bachelor's, I did not learn coding because it was not part of my curriculum. I started learning to code, along with the fundamentals of computer science, because of my work. I started as a software developer with the Alcatel-Lucent IP Division (now known as Nokia IPD), where I was responsible for designing the IPv6 control plane and data plane for their service routers. The whole infrastructure was written in C/C++, and I had to design the IPv6 infrastructure on top of it. The work also required a good understanding of operating systems.
Since I had no background in computer science, I started taking online courses on Coursera to develop my programming skills. They also helped me improve my knowledge of data structures and algorithms. The subject intrigued me so much that I started freelancing and learning new programming languages like Python and JavaScript. I also tried developing games using VPython. In my free time, I started reading technical sites like Slashdot and Hacker News, as well as various subreddits related to computer science. All of this piqued my interest in computer science in general, and I decided to pursue my master's. I got my MS from Virginia Tech with a focus on computer vision. During my master's, I interned at Amazon, and after graduation I joined Amazon as a full-time software developer.
What do you work on?
I currently work as a Software Development Engineer (SDE) at Amazon. I have been an SDE with Amazon for a little over two years and have been with three different orgs during that time. I interned and started as a full-time engineer in the Alexa Skills Kit development org. As an intern, I had an amazing time creating and hosting various Alexa skills. Geolocation was not yet supported back then, and I worked on prototyping geolocation usage with Alexa. Prior to the geolocation feature, Alexa could only take in a single static device address. To integrate Alexa with vehicles or phones (moving devices whose address keeps changing), the location of the device had to be updated continuously. During my internship, I built an Alexa device out of a Raspberry Pi using the Alexa Voice Service. The Raspberry Pi acted as a moving device and kept sending its geolocation, and I wrote Alexa skills that could use those geolocation coordinates.
After Alexa, I spent around 1.5 years in Amazon Advertising. Here, I worked on low-latency distributed cache solutions that serve relevant ads (sponsored products) on Amazon's search and detail pages. Whenever a customer searches for anything on an Amazon web page, they see various results, some with a tag called “sponsored”. These are advertisements on the Amazon page. A lot of machine learning goes on in the background to ensure that ads on Amazon are as intriguing as a non-advertised product. The latency of Amazon search pages is always kept very low to provide a seamless customer experience, so the infrastructure has to understand the query and return the desired results within a few milliseconds. I wrote various applications in Java to support the end-to-end functioning of this workflow. Our ad server hosts many machine learning models to return relevant results to the customer.
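To give a feel for why a cache helps with latency, here is a minimal sketch of a cache-aside read with expiring entries. It is an illustration only, not Amazon's ad-serving code: `TTLCache`, `fetch_ads`, and `slow_lookup` are hypothetical names, and a single in-process dictionary stands in for what would be a distributed cache in production.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-process cache with per-entry expiry (stand-in for a distributed cache)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, list]] = {}

    def get(self, key: str) -> Optional[list]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def put(self, key: str, value: list) -> None:
        self._store[key] = (self._clock() + self._ttl, value)


def fetch_ads(query: str, cache: TTLCache, slow_lookup: Callable[[str], list]) -> list:
    """Cache-aside read: try the cache first, fall back to the slow path on a miss."""
    cached = cache.get(query)
    if cached is not None:
        return cached
    ads = slow_lookup(query)  # hypothetical expensive call, e.g. model scoring
    cache.put(query, ads)
    return ads
```

Repeated queries within the expiry window skip the slow lookup entirely, which is where the latency saving comes from; the expiry keeps stale results from being served forever.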
I also worked on designing an application to help manage the whole model development and deployment life cycle. I have recently moved out of Ads to work in another part of the Amazon retail page team. Now, I work as a software developer for machine learning (ML) with the Econ Tech team at Amazon. Here, I design and manage systems that help put machine learning models into production. An ML model needs various data processing and data validation jobs. These are the first steps in building an ML model and are together called the feature generation step. As a machine learning engineer (MLE) on my current team, I am responsible for automating these feature generation jobs. My team uses Scala with the Spark framework to generate the features. As MLEs, we also investigate ways to automate the ML training, testing and deployment pipelines. We continuously work with open source tools and AWS products to help us with the automation. Amazon is investing in AutoML, and as MLEs we are working towards automating the process of generating an ML model.
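As a rough illustration of what a feature generation step involves, the sketch below validates raw records and then aggregates them into per-customer features. It is written in plain Python rather than the Scala and Spark the team uses, and the `Order` record, the validation rule, and the feature names are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Order:  # hypothetical raw input record
    customer_id: str
    amount: float


def validate(orders: Iterable[Order]) -> List[Order]:
    """Data validation: keep only records with an id and a non-negative amount."""
    return [o for o in orders if o.customer_id and o.amount >= 0]


def generate_features(orders: Iterable[Order]) -> Dict[str, Dict[str, float]]:
    """Feature generation: aggregate validated orders into per-customer features."""
    amounts: Dict[str, List[float]] = {}
    for order in validate(orders):
        amounts.setdefault(order.customer_id, []).append(order.amount)
    return {
        cid: {"order_count": float(len(a)), "avg_amount": sum(a) / len(a)}
        for cid, a in amounts.items()
    }
```

Automating a job like this mostly means running it on a schedule over fresh data and failing loudly when validation drops more records than expected.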
How do you use coding in your projects?
Coding is used everywhere, whether for writing a large-scale application or just a small tool that speeds up the development process (automation). In my 2 years at Amazon, I have programmed in Java, JavaScript, Ruby, Scala and Python. Most projects go through the normal software development life cycle, which includes a design review in which senior engineers review the requirements and the proposed design for the application. Once the design is approved, a software developer implements it in code.
As engineers, we are encouraged to apply software engineering principles and write modular code. This makes it easier to extend the architecture if needed in the future. Most teams at Amazon follow the SOLID principles for coding. Programmers are encouraged to write testable code, and keeping test cases in mind while writing code also makes it a lot more extensible. Programming is even used to manage our infrastructure. If a component within our systems is manual and takes a long time, we automate it programmatically. Various teams at Amazon also hold Lunch and Learn sessions, where senior and junior developers interact and talk about coding fundamentals. Every programming language has standards of its own, and as programmers we are encouraged to keep learning them.
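One way to picture testable, modular code is the dependency inversion principle (the "D" in SOLID): a class depends on a small interface, so a test can swap in a stand-in. The sketch below is a made-up example, not code from any Amazon system; `Notifier`, `JobRunner`, and `RecordingNotifier` are hypothetical names.

```python
from typing import List, Protocol


class Notifier(Protocol):
    """The abstraction the job runner depends on."""

    def send(self, message: str) -> None: ...


class JobRunner:
    """Depends on the Notifier interface, not on a concrete email or chat client."""

    def __init__(self, notifier: Notifier):
        self._notifier = notifier

    def run(self, job_name: str, succeeded: bool) -> None:
        status = "succeeded" if succeeded else "failed"
        self._notifier.send(f"{job_name} {status}")


class RecordingNotifier:
    """Test double that records messages instead of sending them."""

    def __init__(self) -> None:
        self.messages: List[str] = []

    def send(self, message: str) -> None:
        self.messages.append(message)
```

In a test, `JobRunner(RecordingNotifier())` can be exercised without any network access, and a real notifier can be plugged in later without touching `JobRunner`.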
Any fun insights or tips related to the application of computer science in your area?
Coding takes away a lot of manual work.
At big companies like Amazon or Google, engineers program the way the voice assistant listens.
On retail websites, engineers are trying to enhance the shopper’s experience by showing them what they want. Platforms like YouTube and Facebook Video enable us to share high-quality content all across the world. Each of these applications is backed by a CDN and a high-powered distributed infrastructure, and the networking infrastructure is what makes that communication fast and reliable.
There is a lot of engineering and machine learning going on in the background in order to better serve customers. These days, coding is everywhere. An English major might want to learn it to gain insight into various scripts, which falls under natural language processing. A painter or a graphics/3D programmer might want to use it. Hollywood studios use it to reduce the cost of shooting outdoors. Programming languages have evolved over time to provide such features. And the best and the worst part of computer science is that every part of it keeps evolving. Anyone can learn programming and perform wonders in this area.
List of tools and programming languages you use
- Python
- Scala
- Java
- JavaScript
- Ruby
- C
- C++
I consider myself a digital polyglot who has to Google syntax a lot of the time.
Apart from being a software developer, Richa is a self-taught painter who enjoys learning all sorts of mediums (acrylic, oil, watercolor, gouache, etc.). She also does digital art on her iPad at times and is trying to learn shader programming, which is used to create graphics. You can check out her work on Instagram (@painter_wizard) and on GitHub (@richasinha).