What is HTML?
HTML stands for HyperText Markup Language. It’s a special language that web browsers understand.
It’s basically a bunch of instructions about the structure of the document.
Hypertext is probably the most important part of the HTML acronym because it means that you can link from one document to another. A hypertext document is a text file that contains links to other text files.
This is the foundation stone of the web, which in the beginning wasn’t much more than a few pages interconnected with links.
Today, the web is no longer just about text. Much of it is hypermedia, which includes video, photos, and music.
But if you strip down all the bells and whistles, you will still find this fundamental functionality of mutual linking between documents at the core of any modern web application including Facebook, Twitter, and Gmail.
Markup means that HTML surrounds regular text with special code that tells the browser what the structure of the document is and, as a result, how to display its content.
This special code consists of so-called tags, which are a special kind of annotation. Don’t worry, they are human-readable and make perfect sense even to beginners.
For example, <title> is the tag that tells the browser that anything wrapped inside it is, well, the title.
This way, not only the browser but also you, as a web developer, can perfectly understand the meaning of the text inside the <title> tag and format it accordingly.
The next example is the <p> tag, which is short for paragraph. It tells the web browser that the content of this tag should be displayed as a paragraph, which basically means on a new line.
We will discuss the anatomy of tags in detail later, but you might have already noticed that the <title> and <p> tags have a counterpart at the end that contains a forward slash.
In order to wrap the content, most tags have two parts: the opening part at the beginning of the content to be wrapped, and the closing part at the end. But there are also tags with just the opening part. More on that later.
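To make the opening and closing parts concrete, here is a minimal sketch using the two tags mentioned above. The text between the tags is made up purely for illustration.

```html
<!-- <title> is the opening part, </title> is the closing part -->
<title>My First Web Page</title>

<!-- Each <p> ... </p> pair wraps one paragraph -->
<p>This text is a paragraph.</p>
<p>This is another paragraph, so it starts on a new line.</p>
```

In each case the closing part repeats the element name with a forward slash in front of it.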
Language means that HTML has a special syntax, a set of rules you need to follow to create syntactically correct content. Just like in other languages, where subjects and verbs have a special place in the sentence, HTML tags have a special structure you need to follow.
For instance, you need to nest tags correctly, which means that the closing part of each tag must be written in the right place.
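As a rough illustration of correct nesting, consider the sketch below. The <em> tag (emphasis) is not discussed in this text and is only an assumption chosen for the example. The incorrect variant is kept inside a comment so that the snippet itself stays valid.

```html
<!-- Correct: <em> is opened inside <p> and closed before </p> -->
<p>This is <em>very</em> important.</p>

<!--
  Incorrect (do not copy): the closing parts are in the wrong order
  <p>This is <em>very important.</p></em>
-->
```

A simple rule of thumb is that the tag opened last must be closed first.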
Before 1997, there were no standards that the vendors of web browsers would abide by. Vendors invented their own tags and implemented existing tags differently, so the same tag looked and worked differently in Internet Explorer and Netscape.
This was very hard for developers because they had to use all sorts of tricks to make their pages look at least a bit similar in different browsers. I remember vividly how excruciating it was for me and my colleagues to go through this process and I am happy that it’s all just a memory now :-).
Luckily, around 1997 the HTML 4.0 standard was published by the World Wide Web Consortium (W3C), and some vendors started paying attention and changed their browsers to respect this standard, at least to some extent.
Around 2000, the W3C came up with a new standard called XHTML 1.0, but the browser vendors didn’t agree with where this standard was headed, and they also thought that the whole W3C standardization process was way too slow.
In 2004, browser vendors, or rather individuals from Apple, the Mozilla Foundation, and Opera Software, created another group called WHATWG, which stands for Web Hypertext Application Technology Working Group.
WHATWG basically stands behind the current HTML 5 standard, because it pushes changes forward in a less democratic and quicker way.
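Around 2008, the two organizations started to work together on HTML 5, which became an official W3C Recommendation in 2014.
The outcome was that they divided their work: the W3C published numbered versions such as HTML 5, while WHATWG kept working on a continuously evolving HTML, and once in a while new features made it into the numbered standard. Since 2019, the two groups have agreed that the WHATWG version, known as the HTML Living Standard, is the single authoritative one.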
These days, all modern browsers are updated automatically, and with these updates they introduce new features. This is a good thing, because if the user of your website or web application has one of these modern browsers, you as a web developer probably don’t need to care much about whether a new feature will be available to this user.
Unless users deliberately forbid updates, new features will work for them most of the time.
But you, as a web developer, should still have an idea of what is going on in this field, how the standard is evolving, and which new features are already implemented by the major browser vendors, even though they might not have made it into the standard yet.
For that, here are some resources:
www.caniuse.com: this is a great website where you just type the name of an HTML tag, CSS selector, or another technology, and it tells you which browsers have already implemented this feature and which haven’t.
validator.w3.org: this website checks the code you provide and tells you whether the code is standards-compliant or not.
www.w3counter.com/globalstats.php: this is a great source of up-to-date browser statistics showing which browsers are the most popular. This is important if you want to implement some non-standard feature and you know that it’s not supported by a specific version of some browser. Based on how many users still use this version, you can decide whether to implement the feature even though some users might not be able to use it, or to support those users too and not implement the feature.
As you can see in the screenshot above, as of January 2018, Chrome was crushing all the others with a 58.4% market share, so if your code worked in Chrome but didn’t work in, let’s say, Opera, which had only a 3.9% market share, it was probably a safe bet to implement this code anyway.
Anatomy of HTML tag
A tag is always made up of two building blocks: element name, and angle brackets. Together they create a tag. Most of the tags surround some content. Such tags have two parts.
There is an opening part, which is present in every tag, and there is a closing part, which some tags are missing. The opening part can have so-called attributes with values which further specify the behavior of the tag. The closing part has no such thing, only the forward slash followed by the name of the tag.
The opening part of the tag tells the browser where the content that should be treated in some special way begins. The closing part of the tag tells the browser where that content ends.
Only tags that wrap some content need both the opening and the closing part. Some tags don’t wrap any content, so they lack the closing part. This is the case with images, for example.
When writing tags, you need to be very careful about spacing. No space is allowed between the opening bracket of the opening part of the tag and the name of the element. Similarly, no space is allowed between the opening bracket of the closing part of the tag and the forward slash, or between the forward slash and the name of the element.
However, there must be a space between the name of the element and the name of the attribute. Other spaces inside the tag, for example extra spaces before an attribute or before the closing bracket, are allowed but ignored by the browser.
It doesn’t matter if you use single or double quotes to wrap the value of the attribute. Actually, in HTML 5 you don’t even have to use quotes at all, as long as the value contains no spaces or other special characters, but it’s good practice to use them because quotes make the code more readable.
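The spacing and quoting rules above can be sketched as follows. The link text and the file name about.html are hypothetical and serve only as an illustration.

```html
<!-- Double quotes, single quotes, and no quotes all work for a simple value -->
<a href="about.html">About</a>
<a href='about.html'>About</a>
<a href=about.html>About</a>

<!-- Extra spaces around the attribute are ignored,
     but the space after the element name is required -->
<a   href="about.html"  >About</a>
```

All four links behave the same way. The unquoted form works here only because the value contains no spaces.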
You might still find in the code on websites something like this:
This is a so-called self-closing tag, and it’s a relic of the XHTML standard. Browsers still accept it and know how to display it correctly, but in HTML 5 the trailing slash on such tags has no effect, so try to avoid using it.
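For comparison, here is a small sketch of the XHTML-style spelling next to the plain HTML 5 spelling. The <br> (line break) element and the file name photo.jpg are assumptions chosen only for illustration.

```html
<!-- XHTML-style spelling with a trailing slash -->
<br />
<img src="photo.jpg" alt="A photo" />

<!-- HTML 5 spelling of the same tags -->
<br>
<img src="photo.jpg" alt="A photo">
```

Both spellings are displayed the same way by browsers.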
Basic HTML document structure
Before we dive into the description of the most popular HTML elements, I will show you the basic structure of an HTML document, which you should always follow in order to keep your website standards-compliant.
Every HTML document should start with the declaration of the type of the document:
It doesn’t matter if it’s lowercase or uppercase, just make sure there is no space between the first angle bracket, the exclamation mark, and the doctype text.
If you omit this declaration, the web browser will still display the page, but it will treat it as something that doesn’t follow the HTML 5 standard. For non-compliant content, browsers sometimes use the so-called quirks mode, in which the layout can look strange, styles might be applied differently than you would expect, and so on.
So there’s a really good reason to use this declaration.
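The declaration discussed above is a single line at the very top of the file. The same spelling appears in the complete listings later in this section.

```html
<!doctype html>
<!-- The uppercase spelling, <!DOCTYPE html>, is equivalent. -->
```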
Next is the <html> tag followed by the closing counterpart </html> at the very end of the document:
Inside the <html> tag are the <head> tag and the <body> tag, both with their closing parts:
The <head> tag usually contains information needed to render the page properly, like character encoding, while the <body> tag contains the actual content of the page.
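The skeleton described so far can be sketched like this. The comments are placeholders added for illustration. The snippet is not yet standards-compliant on its own, because the <head> is still empty.

```html
<!doctype html>
<html>
  <head>
    <!-- information about the page goes here -->
  </head>
  <body>
    <!-- the actual content of the page goes here -->
  </body>
</html>
```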
The last tag you need to add to have a standard compliant HTML file is the <title> tag which belongs inside the <head> tag and contains the title of the page.
So your code should look like this now:
<!doctype html>
<html>
  <head>
    <title>My First Web Page</title>
  </head>
  <body></body>
</html>
Go ahead and check on https://validator.w3.org/#validate_by_input if your code is valid. Just copy it to the clipboard, open the website and paste the code to the form.
Hit the big Check button and you should see the green message that no errors or warnings were found.
You might have noticed that the <title> tag is inside the <head> tag, which is inside the <html> tag.
This kind of nesting is very frequent in HTML, and you need to be careful to place the closing part of each tag properly.
If you get this wrong, the browser will probably cope with it, but your code won’t be clean and standards-compliant, and it can get you into a lot of trouble once you have plenty of code in your file.
Some code editors help you with the proper nesting by allowing you to fold and expand the code based on the nesting.
Both Atom and Brackets have this feature.
When you look again at our code, you can see that there’s a space at the beginning of each new line. Sometimes there’s more space, sometimes less.
You might have noticed that the amount of space corresponds to the nesting level of the tag. The deeper the tag is nested, the more space is added.
This is called indentation. The space is made with the Tab key, and the indentation makes the code very easy to read.
It’s a good idea to specify the character set so the browser knows how to display specific characters.
This is done with the <meta> tag which belongs inside the <head> tag.
You probably can’t go wrong with UTF-8, which is a Unicode character set.
Open the index.html file you’ve saved on Desktop and insert this tag inside the <head> tag:
Two things to notice here:
- The <meta> tag has no closing part
- We specified the charset attribute and assigned it a value of utf-8
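For reference, this is the tag the two points above describe, as it appears in the final listing of this section:

```html
<meta charset="utf-8">
```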
The <body> tag is for all the content of the page. Go ahead and type something special between the opening part and the closing part of the <body> tag.
Your final code should look like this now:
<!doctype html>
<html>
  <head>
    <meta charset="utf-8">
    <title>My First Web Page</title>
  </head>
  <body>
    This is my first web page! :-)
  </body>
</html>
Save the changes, locate the index.html file on Desktop and open it in the web browser.
Remember that the browser reads the code like we humans do (from top to bottom) and renders the content the same way.
This might not seem like anything special, but it will be very important in the future, when we talk about linking other content to the page and how the position of this content alters the look and behavior of the page.
Sometimes, you need a portion of your code to be ignored by the browser.
You can select the code to be ignored and comment it out.
To do that, you insert <!-- at the beginning of the code you want to be ignored and --> at the end.
In modern editors, like Atom or Brackets, such code will be greyed out, giving you the hint that this code won’t be interpreted by the browser.
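A minimal sketch of commenting code out, with made-up paragraph text, might look like this:

```html
<p>This paragraph is rendered by the browser.</p>

<!-- <p>This paragraph is commented out, so the browser ignores it.</p> -->

<!--
  A comment can also span
  several lines.
-->
```

Only the first paragraph shows up on the page.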
Prior to HTML 5, there were two types of elements: block-level elements and inline elements.
This specified which element can be nested inside another element.
With HTML 5, though, things got more complicated, because now we have not 2 but 7 types of elements.
However, understanding those two old categories is still very practical, because they work very well with the existing CSS rules.
By default, block-level elements are rendered by the browser to begin on a new line. You can change this behavior via CSS, but we’ll get to that later.
They are allowed to contain inline elements or other block-level elements within them.
They are roughly equivalent to the new HTML 5 category called Flow Content.
The most generic block-level element is the <div> element which stands for division.
Inline elements are by default rendered on the same line. Again, you can change this with CSS, and I will show you later how to do it.
They are allowed to contain only other inline elements, but not block-level elements.
They roughly correspond to the new HTML 5 category called Phrasing Content.
The most generic inline element is the <span> element.
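As a small sketch of the two generic elements, with text made up for illustration:

```html
<div>This division starts on a new line.</div>
<div>
  This second division also starts on a new line, and the
  <span>span inside it</span> stays on the same line as the surrounding text.
</div>
```

Neither element changes the look of the text by itself. They only differ in how they take part in the flow of the page.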
Let’s take a look at the example to see the difference between block-level elements and inline elements.
We will use two basic elements, <p> and <a>.
The <p> tag represents a paragraph, and this element is suitable for text that you want to format as a paragraph of an article.
The <a> tag represents an anchor, and it is the element for linking one document to another document or for creating a link to a specific section of the same document.
We will talk about these elements and many more in detail, but for now, this should suffice.
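Both uses of the <a> tag can be sketched as follows. The file name about.html, the id value contact, and all the text are hypothetical. The id attribute is an assumption as well, since it is not covered in this text.

```html
<!-- Link to another document -->
<a href="about.html">About this site</a>

<!-- Link to a section of the same document, identified by its id -->
<a href="#contact">Jump to the contact section</a>

<p id="contact">You can reach us by e-mail.</p>
```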
It happens that the <p> tag is by default defined as a block-level element while the <a> tag is by default defined as an inline element.
Let’s take a look at this code:
<!doctype html>
<html>
  <head>
    <meta charset="utf-8">
    <title>Block-Level vs Inline Elements</title>
  </head>
  <body>
    <p>*** PARAGRAPH 1: This is the first paragraph ***</p>
    <p>*** PARAGRAPH 2: This is the second paragraph ***</p>
    <a href="#">LINK 1: This is the first link</a>
    <p>
      *** PARAGRAPH 3: This is the third paragraph ***
      <a href="#">LINK 2: This is the second link</a>
    </p>
  </body>
</html>
Create a new file called block-vs-inline.html, save it to your Desktop and place this code inside. Save the changes and open this file in the web browser.
As you can see, PARAGRAPH 1, PARAGRAPH 2, and even LINK 1 each start on a new line.
To help you better understand what is going on here, we will use Chrome Developer Tools.
While still in the browser, click the right mouse button anywhere on the blank page and select Inspect from the contextual menu.
This will open a whole new world of tools and features, but don’t be intimidated by all the buttons and information if you see this for the first time. We will go through it all slowly together.
Depending on your screen resolution, try to change the layout of the panes so you see everything nicely. Try to get the layout similar to this:
Now, if you go to the pane with HTML code, you will see that this shows our code. Go ahead and click on the arrows next to the code to expand it all.
If you hover your mouse over an element name, you will see that the corresponding element is highlighted in the left pane.
And this is exactly what we will use to understand the document flow.
Block-level elements will always start on a new line and since they take the whole width of the line, they will push any other element to the new line. There is one exception to this rule. Once an inline element is nested in a block-level element, it won’t be pushed to the new line.
Inline elements fit nicely to the current flow, so they don’t start on the new line and they don’t push any other elements to the new line.
Even though this might be quite confusing, it’s really important to understand these basic rules, because we will make use of them in the future, especially once we deal with so-called responsive design.
When you move your mouse over the paragraph and the link elements, you will see those elements highlighted in blue. This color indicates the space the element takes up on the page. Don’t worry about the orange color above and below the blue color in the case of a paragraph.
Now try to change the width of the left pane where your page is displayed to make it wider and move your mouse over both elements again.
See? No matter how wide the pane is, the paragraph will always take up the whole width of the line, not leaving any space there for anything else.
The link element, on the other hand, takes up only as much space as its content needs.
And just to be perfectly clear, this has nothing to do with the indentation.
Create a new file called no-indentation.html on your Desktop, place this code inside, save the changes and open the file in the web browser.
<!doctype html>
<html>
<head>
<meta charset="utf-8">
<title>No Indentation</title>
</head>
<body>
<p>*** PARAGRAPH 1: This is the first paragraph ***</p><p>*** PARAGRAPH 2: This is the second paragraph ***</p><a href="#">LINK 1: This is the first link</a><p>*** PARAGRAPH 3: This is the third paragraph ***<a href="#">LINK 2: This is the second link</a></p>
</body>
</html>
As you can see, even though the content of the body is on the same line, because we have removed all the new line characters and indentations, the document will render exactly the same way. But, it’s less friendly for the