{"id":3515,"date":"2026-06-11T09:42:57","date_gmt":"2026-06-11T09:42:57","guid":{"rendered":"https:\/\/www.mhtechin.com\/support\/?p=3515"},"modified":"2026-06-11T09:42:57","modified_gmt":"2026-06-11T09:42:57","slug":"word2vec-glove-and-vector-spaces-how-machines-learn-the-meaning-of-words","status":"publish","type":"post","link":"https:\/\/www.mhtechin.com\/support\/word2vec-glove-and-vector-spaces-how-machines-learn-the-meaning-of-words\/","title":{"rendered":"Word2Vec, GloVe, and Vector Spaces: How Machines Learn the Meaning of Words"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">Natural Language Processing (NLP) has transformed the way machines interact with human language. From search engines and chatbots to recommendation systems and virtual assistants, NLP powers many of the intelligent applications we use every day.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">However, one fundamental challenge exists: computers do not understand words the way humans do.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">When humans read words such as <em>king<\/em>, <em>queen<\/em>, <em>doctor<\/em>, or <em>hospital<\/em>, we automatically understand their meanings and relationships. Computers, on the other hand, see text as a sequence of characters without any inherent meaning.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To bridge this gap, researchers developed techniques that convert words into numerical representations while preserving their semantic meaning. 
These representations are known as <strong>word embeddings<\/strong>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In this article, we will explore Vector Spaces, Word2Vec, and GloVe: three foundational concepts that helped machines move beyond simple text processing toward understanding language more intelligently.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h4 class=\"wp-block-heading\">Why Computers Struggle with Text<\/h4>\n\n\n\n<p class=\"wp-block-paragraph\">Computers are designed to process numbers, not words.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Consider the following sentence:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>I love machine learning.\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">A human immediately understands its meaning, but even after the sentence is split into tokens, a computer sees only a list of strings with no inherent meaning:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>&#091;'I', 'love', 'machine', 'learning']\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">The machine does not understand:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>What &#8220;love&#8221; means<\/li>\n\n\n\n<li>That &#8220;machine learning&#8221; is a field of study<\/li>\n\n\n\n<li>That similar words may share similar meanings<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Traditional machine learning algorithms require numerical input.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This creates an important question:<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\">How can we convert words into numbers while preserving their meaning?<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h4 class=\"wp-block-heading\">From Words to Numbers: The Need for Word Embeddings<\/h4>\n\n\n\n<p class=\"wp-block-paragraph\">Early NLP techniques attempted to solve this problem using methods such as:<\/p>\n\n\n\n<h5 
class=\"wp-block-heading\">One-Hot Encoding<\/h5>\n\n\n\n<p class=\"wp-block-paragraph\">Example vocabulary:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>&#091;\"cat\", \"dog\", \"car\"]\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Representations:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>cat = &#091;1, 0, 0]\ndog = &#091;0, 1, 0]\ncar = &#091;0, 0, 1]\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">While simple, this approach has major limitations:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>High dimensionality<\/li>\n\n\n\n<li>Sparse vectors<\/li>\n\n\n\n<li>No semantic relationships<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">For example:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>cat = &#091;1,0,0]\ndog = &#091;0,1,0]\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">To a computer, &#8220;cat&#8221; and &#8220;dog&#8221; appear completely unrelated, even though both are animals.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h4 class=\"wp-block-heading\">What Are Word Embeddings?<\/h4>\n\n\n\n<p class=\"wp-block-paragraph\">Word embeddings are dense numerical vectors that capture semantic meaning.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Instead of representing words as isolated entities, embeddings place similar words close together in a mathematical space.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Example:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>King  \u2192 &#091;0.72, 0.14, 0.89]\nQueen \u2192 &#091;0.70, 0.16, 0.87]\nMan   \u2192 &#091;0.60, 0.22, 0.75]\nWoman \u2192 &#091;0.58, 0.24, 0.73]\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Notice how related words receive similar vector representations.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This allows machines to learn relationships between words rather than simply memorizing them.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h4 
class=\"wp-block-heading\">Understanding Vector Spaces<\/h4>\n\n\n\n<p class=\"wp-block-paragraph\">A vector space is a mathematical environment where words are represented as points.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Instead of viewing words as text, we represent them as vectors.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Imagine a simple 2-dimensional space:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>          Queen\n             *\n             |\n             |\nKing *--------*\n             |\n             |\n             *\n          Woman\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">In reality, modern embeddings use hundreds of dimensions.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Common dimensions include:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>100<\/li>\n\n\n\n<li>200<\/li>\n\n\n\n<li>300<\/li>\n\n\n\n<li>768<\/li>\n\n\n\n<li>1024<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Each dimension captures different characteristics of language.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The most important idea is:<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\">Words with similar meanings appear closer together in vector space.<\/p>\n<\/blockquote>\n\n\n\n<p class=\"wp-block-paragraph\">For example:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>Paris  \u2192 France\nBerlin \u2192 Germany\n\nKing   \u2192 Queen\nMan    \u2192 Woman\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">The relationships between these words can be learned mathematically.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h4 class=\"wp-block-heading\">Word2Vec<\/h4>\n\n\n\n<h5 class=\"wp-block-heading\">What is Word2Vec?<\/h5>\n\n\n\n<p class=\"wp-block-paragraph\">Word2Vec is one of the most influential word embedding techniques developed by Google in 2013.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Instead of manually 
defining relationships, Word2Vec learns word representations automatically by analyzing large amounts of text.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The core idea is simple:<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\">Words appearing in similar contexts tend to have similar meanings.<\/p>\n<\/blockquote>\n\n\n\n<p class=\"wp-block-paragraph\">For example:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>The cat sits on the mat.\n\nThe dog sits on the mat.\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Because &#8220;cat&#8221; and &#8220;dog&#8221; appear in similar contexts, Word2Vec learns that they are semantically related.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>from gensim.models import Word2Vec\n\n# Toy corpus: each sentence is a list of lowercase tokens\nsentences = &#091;\n    &#091;\"i\", \"love\", \"machine\", \"learning\"],\n    &#091;\"machine\", \"learning\", \"is\", \"fun\"],\n    &#091;\"i\", \"love\", \"python\"]\n]\n\n# Train 100-dimensional vectors with a context window of 5 words;\n# min_count=1 keeps every word, even those seen only once\nmodel = Word2Vec(\n    sentences,\n    vector_size=100,\n    window=5,\n    min_count=1,\n    workers=4\n)\n\n# Print the learned vector for \"machine\"\nprint(model.wv&#091;\"machine\"])\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h5 class=\"wp-block-heading\">CBOW (Continuous Bag of Words)<\/h5>\n\n\n\n<p class=\"wp-block-paragraph\">CBOW predicts a target word using surrounding words.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Example:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>The ____ sits on the mat\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Context words:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>The, sits, on, the, mat\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Target word:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>cat\n<\/code><\/pre>\n\n\n\n<h5 class=\"wp-block-heading\">How 
CBOW Works<\/h5>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Takes surrounding words as input.<\/li>\n\n\n\n<li>Learns context patterns.<\/li>\n\n\n\n<li>Predicts the missing word.<\/li>\n<\/ol>\n\n\n\n<h5 class=\"wp-block-heading\">Advantages of CBOW<\/h5>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Faster training<\/li>\n\n\n\n<li>Efficient for large datasets<\/li>\n\n\n\n<li>Works well for common words<\/li>\n<\/ul>\n\n\n\n<h5 class=\"wp-block-heading\">Limitations of CBOW<\/h5>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Less effective for rare words<\/li>\n\n\n\n<li>May lose detailed contextual information<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h5 class=\"wp-block-heading\">Skip-Gram<\/h5>\n\n\n\n<p class=\"wp-block-paragraph\">Skip-Gram works in the opposite direction to CBOW.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Instead of predicting the center word from its context, it predicts the surrounding words from the center word.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Example:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Target Word:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>cat\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Predicted Context:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>The\nsits\non\nthe\nmat\n<\/code><\/pre>\n\n\n\n<h5 class=\"wp-block-heading\">How Skip-Gram Works<\/h5>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Takes one word as input.<\/li>\n\n\n\n<li>Predicts neighboring words.<\/li>\n\n\n\n<li>Learns richer semantic relationships.<\/li>\n<\/ol>\n\n\n\n<h5 class=\"wp-block-heading\">Advantages of Skip-Gram<\/h5>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Better for rare words<\/li>\n\n\n\n<li>Captures semantic relationships effectively<\/li>\n\n\n\n<li>Produces high-quality embeddings<\/li>\n<\/ul>\n\n\n\n<h5 class=\"wp-block-heading\">Limitations of Skip-Gram<\/h5>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Slower training<\/li>\n\n\n\n<li>Requires more computation<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator 
has-alpha-channel-opacity\" \/>\n\n\n\n<h5 class=\"wp-block-heading\">Advantages of Word2Vec<\/h5>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Learns semantic relationships automatically<\/li>\n\n\n\n<li>Produces dense vector representations<\/li>\n\n\n\n<li>Efficient training process<\/li>\n\n\n\n<li>Captures meaningful patterns in language<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Example:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>King - Man + Woman \u2248 Queen\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">This famous example shows that relationships between words can be expressed as vector arithmetic: subtracting the vector for &#8220;man&#8221; from &#8220;king&#8221; and adding &#8220;woman&#8221; yields a vector close to &#8220;queen&#8221;.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h5 class=\"wp-block-heading\">Limitations of Word2Vec<\/h5>\n\n\n\n<ul class=\"wp-block-list\">\n<li>One vector per word<\/li>\n\n\n\n<li>Cannot handle multiple meanings effectively<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Example:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>Apple (fruit)\nApple (company)\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Both meanings receive the same vector.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Context awareness is limited<\/li>\n\n\n\n<li>Requires large training datasets<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">These limitations motivated the development of improved embedding techniques.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h4 class=\"wp-block-heading\">GloVe (Global Vectors for Word Representation)<\/h4>\n\n\n\n<h5 class=\"wp-block-heading\">What is GloVe?<\/h5>\n\n\n\n<p class=\"wp-block-paragraph\">GloVe (Global Vectors for Word Representation) is a word embedding model developed by researchers at Stanford University in 2014.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Unlike Word2Vec, which primarily learns from local context, GloVe combines both:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Local context information<\/li>\n\n\n\n<li>Global word co-occurrence 
statistics<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">This helps generate richer word representations.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h5 class=\"wp-block-heading\">How GloVe Works<\/h5>\n\n\n\n<p class=\"wp-block-paragraph\">GloVe analyzes how frequently words appear together across an entire corpus.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Example:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ice appears frequently with:\ncold\nwinter\nsnow\nfreeze\n\nsteam appears frequently with:\nhot\nwater\nheat\nboil\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">By examining these relationships globally, GloVe learns meaningful vector representations.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Instead of focusing only on neighboring words, it studies broader language patterns.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import gensim.downloader as api\n\n# Download (on first use) and load pretrained 100-dimensional GloVe vectors\nglove = api.load(\"glove-wiki-gigaword-100\")\n\n# Print the first 10 components of the vector for \"king\"\nprint(glove&#091;\"king\"]&#091;:10])\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Querying the vectors for the nearest neighbors of a word such as &#8220;machine&#8221; returns word and score pairs of the following form (the scores shown are illustrative):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>&#091;\n    ('machines', 0.82),\n    ('computer', 0.79),\n    ('technology', 0.75)\n]\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h4 class=\"wp-block-heading\">Word2Vec vs GloVe<\/h4>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Feature<\/th><th>Word2Vec<\/th><th>GloVe<\/th><\/tr><\/thead><tbody><tr><td>Learning Method<\/td><td>Predictive<\/td><td>Count-Based<\/td><\/tr><tr><td>Context Usage<\/td><td>Local Context<\/td><td>Global Context<\/td><\/tr><tr><td>Training Speed<\/td><td>Fast<\/td><td>Fast<\/td><\/tr><tr><td>Semantic Relationships<\/td><td>Strong<\/td><td>Strong<\/td><\/tr><tr><td>Statistical Information<\/td><td>Limited<\/td><td>Extensive<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h4 
class=\"wp-block-heading\">Conclusion<\/h4>\n\n\n\n<p class=\"wp-block-paragraph\">The development of Word2Vec and GloVe marked a significant milestone in Natural Language Processing. By transforming words into meaningful vector representations, these techniques enabled machines to move beyond simple keyword matching and begin understanding relationships between words.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Vector spaces provide a powerful framework where semantic meaning can be represented mathematically, allowing algorithms to identify similarities, analogies, and contextual relationships. Although newer transformer-based models have emerged, Word2Vec and GloVe remain fundamental concepts for understanding how modern NLP evolved.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">As we continue our NLP journey, the next step is learning how these embeddings are compared and utilized in real-world systems. This leads naturally to concepts such as Cosine Similarity, Semantic Similarity, and Vector Space Semantics, which form the foundation of modern semantic search and retrieval systems.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Natural Language Processing (NLP) has transformed the way machines interact with human language. From search engines and chatbots to recommendation systems and virtual assistants, NLP powers many of the intelligent applications we use every day. However, one fundamental challenge exists: computers do not understand words the way humans do. 
When humans read words such as [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-3515","post","type-post","status-publish","format-standard","hentry","category-support"],"_links":{"self":[{"href":"https:\/\/www.mhtechin.com\/support\/wp-json\/wp\/v2\/posts\/3515","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.mhtechin.com\/support\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.mhtechin.com\/support\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.mhtechin.com\/support\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.mhtechin.com\/support\/wp-json\/wp\/v2\/comments?post=3515"}],"version-history":[{"count":1,"href":"https:\/\/www.mhtechin.com\/support\/wp-json\/wp\/v2\/posts\/3515\/revisions"}],"predecessor-version":[{"id":3517,"href":"https:\/\/www.mhtechin.com\/support\/wp-json\/wp\/v2\/posts\/3515\/revisions\/3517"}],"wp:attachment":[{"href":"https:\/\/www.mhtechin.com\/support\/wp-json\/wp\/v2\/media?parent=3515"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.mhtechin.com\/support\/wp-json\/wp\/v2\/categories?post=3515"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.mhtechin.com\/support\/wp-json\/wp\/v2\/tags?post=3515"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}