Gen AI: One Down, One To Go
Just like that, the introductory course to Gen AI is complete. In all honesty, it was not what I had anticipated when I started. Rather than broad theory, it focused heavily on prompt design and parameter manipulation. Through experimentation, the course had me alter variables, retest, and analyze the results to understand each parameter's influence on the output. Though it wasn't what I expected, it still kept me thinking.
The course guided me through the Vertex AI API, a unified interface for interacting with the different Gemini models. It also demonstrated how to call Gemini models from Python via the API. This is where things started to get a little more interesting. With my newfound knowledge of prompt design and parameter tuning, I now have the ability to integrate Gemini models into a program of my own. Do I have a project in mind? Not in the slightest, but it’s reassuring to know that when an idea strikes, I’ll be ready to apply what I’ve learned from this course.
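To make that concrete, here's a rough sketch of the kind of call the course walks through, using the vertexai Python SDK. The project ID, model name, prompt, and parameter values are just placeholders of my own, not the course's exact code:

```python
import vertexai
from vertexai.generative_models import GenerativeModel, GenerationConfig

# Placeholder project and region -- swap in your own GCP settings.
vertexai.init(project="my-project-id", location="us-central1")

model = GenerativeModel("gemini-1.0-pro")

# Re-run the same prompt while varying temperature to see how sampling
# randomness changes the output -- the kind of experiment the course
# has you repeat for top_p, top_k, and the other parameters.
prompt = "Explain diffusion models in one sentence."
for temperature in (0.0, 0.5, 1.0):
    config = GenerationConfig(temperature=temperature, max_output_tokens=128)
    response = model.generate_content(prompt, generation_config=config)
    print(f"temperature={temperature}: {response.text}\n")
```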
~
Having finished the beginner pathway, I quickly transitioned to the advanced pathway. It began with two short courses, Introduction to Image Generation and Attention Mechanism. The former taught me about diffusion models, which draw inspiration from thermodynamics: a forward process iteratively adds noise to an image, while a neural network is trained to reverse it, iteratively removing that noise until a clean image emerges. The latter course introduced a mechanism for translation between languages (at least, that was the use case in this scenario). Given the sentence "The black cat sat." as input, a translator may focus too much on the word "black" rather than "cat," potentially causing a translation error. This is where the attention mechanism plays its role. The model scores each input word against the word currently being generated, using the network's hidden states, and turns those scores into a probability distribution of weights. Applying these weights to words and phrases, the model is able to produce a much more accurate translation.
This leads to my current position in the pathway: Encoder-Decoder Architecture. This is probably the first course in the series that has left me confused, even after rewatching it a couple of times. In this course, we dive into the Python implementation and are guided through building a recurrent neural network (RNN). I am still digesting the material, so I will refrain from explaining what I have understood so far, since I fear I may get it wrong. Either way, I will strive to finish and confidently understand the course before the next Gen AI related post.
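So that I have something to come back to, here is a bare-bones Keras skeleton of the general encoder-decoder shape: an encoder RNN compresses the source sentence into a state, and a decoder RNN generates the target sentence starting from that state. This is my own rough sketch under toy assumptions, not the course's notebook, and it leaves out training and inference details.

```python
import tensorflow as tf

vocab_size, embed_dim, hidden_units = 1000, 64, 128  # toy sizes

# Encoder: embed the source tokens and compress them into a final hidden state.
encoder_inputs = tf.keras.Input(shape=(None,), dtype="int32")
enc_embed = tf.keras.layers.Embedding(vocab_size, embed_dim)(encoder_inputs)
_, enc_state = tf.keras.layers.GRU(hidden_units, return_state=True)(enc_embed)

# Decoder: generate the target tokens, initialized with the encoder's state.
decoder_inputs = tf.keras.Input(shape=(None,), dtype="int32")
dec_embed = tf.keras.layers.Embedding(vocab_size, embed_dim)(decoder_inputs)
dec_outputs = tf.keras.layers.GRU(hidden_units, return_sequences=True)(
    dec_embed, initial_state=enc_state
)
logits = tf.keras.layers.Dense(vocab_size)(dec_outputs)

model = tf.keras.Model([encoder_inputs, decoder_inputs], logits)
model.summary()
```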
~
Since the Google pathways for Gen AI end after this advanced section, I hope to dive into a YouTube series where the instructor walks through the creation of a neural network from scratch. I plan to work along with his videos to build a tangible project that demonstrates my understanding of basic neural network principles and models.
Until next time!
~ Cyrus Foroudian