
HTML Entity Encoder and Decoder

Convert reserved HTML characters to entities and decode common named or numeric entities.

Encoded

&lt;span data-value=&quot;5 &amp; 7&quot;&gt;logic&lt;/span&gt;

Decoded

<span data-value="5 & 7">logic</span>

HTML Entities and Safe Text Rendering

HTML entities represent characters using text sequences that are safe to place inside markup. The most common examples are ampersand, less-than, greater-than, quotation mark, and apostrophe. These characters can have structural meaning in HTML. A less-than sign can start a tag, a quotation mark can close an attribute value, and an ampersand can begin an entity. Encoding these characters prevents text from being interpreted as markup.

The entity for ampersand is &amp;: the name amp written between an ampersand and a semicolon. Less-than is &lt;, greater-than is &gt;, double quote is &quot;, and apostrophe is often written as the numeric entity &#39;. Numeric entities can also represent any Unicode code point in decimal or hexadecimal form; for example, &#8364; and &#x20AC; both produce the euro sign. This tool handles the common reserved characters plus numeric entities, which covers the most frequent debugging and documentation cases.
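As a rough illustration of the named and numeric forms described above, the sketch below uses Python's standard html module. This is an assumption made for illustration and is not how this tool is implemented. Note that html.escape writes the apostrophe in hexadecimal form (&#x27;), which is equivalent to the decimal &#39;.

```python
import html

# Encode the five reserved characters with the standard library.
reserved = "& < > \" '"
encoded = html.escape(reserved, quote=True)
print(encoded)  # &amp; &lt; &gt; &quot; &#x27;

# Decimal and hexadecimal numeric entities decode to the same character.
apostrophe_decimal = html.unescape("&#39;")
apostrophe_hex = html.unescape("&#x27;")
euro_decimal = html.unescape("&#8364;")
euro_hex = html.unescape("&#x20AC;")
print(apostrophe_decimal == apostrophe_hex)  # True
print(euro_decimal, euro_hex)                # € €
```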

Encoding is not the same as encryption. It does not hide data. It only changes how text is interpreted by an HTML parser. The encoded value &lt;script&gt; is displayed on a page as the visible text <script>, not run as an executable script tag. That distinction matters for security, documentation, code samples, templating, and content management systems.

Manual Encoding Steps

Suppose the text is <span data-value="5 & 7">logic</span>. To show that as literal text inside an HTML page, replace the ampersand first, so that the ampersands introduced by the later replacements are not themselves encoded a second time. Then replace less-than with &lt;, greater-than with &gt;, double quotes with &quot; when needed, and apostrophes with a safe entity such as &#39;. The encoded result, &lt;span data-value=&quot;5 &amp; 7&quot;&gt;logic&lt;/span&gt;, can be stored or displayed without the browser treating it as an actual span element.
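The steps above can be sketched as a short Python function. The names encode_html and encode_wrong_order are hypothetical and exist only for this illustration; the second function shows what goes wrong when the ampersand is replaced last.

```python
def encode_html(text: str) -> str:
    """Encode the five reserved characters, ampersand first."""
    replacements = [
        ("&", "&amp;"),   # must run first
        ("<", "&lt;"),
        (">", "&gt;"),
        ('"', "&quot;"),
        ("'", "&#39;"),
    ]
    for char, entity in replacements:
        text = text.replace(char, entity)
    return text


def encode_wrong_order(text: str) -> str:
    """Deliberately wrong: the ampersand is replaced after the angle brackets."""
    text = text.replace("<", "&lt;").replace(">", "&gt;")
    return text.replace("&", "&amp;")


sample = '<span data-value="5 & 7">logic</span>'
encoded = encode_html(sample)
print(encoded)
# &lt;span data-value=&quot;5 &amp; 7&quot;&gt;logic&lt;/span&gt;

# With the wrong order, the ampersand inside &lt; is encoded again.
print(encode_wrong_order("<"))  # &amp;lt;
```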

Where Entities Matter

Entities are used in blog posts, documentation pages, code examples, CMS fields, static site generators, email templates, and server-rendered web applications. They are especially important when showing snippets of HTML, XML, SVG, or template syntax. Without encoding, the browser may hide part of the example, create unintended elements, or expose a security problem if untrusted input is inserted into a page.

Attribute context deserves extra care. Text inside an HTML attribute must not break out of the surrounding quote. Encoding quotes is therefore important when user-controlled text is inserted into quoted attributes. URL context, JavaScript string context, and CSS context have their own escaping rules; HTML entity encoding alone is not a universal sanitizer for every context.
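A minimal sketch of the attribute case, assuming Python's standard html module and a hypothetical user_text value chosen to break out of a double-quoted attribute. It covers only quoted HTML attributes; it does not address URL, JavaScript, or CSS contexts.

```python
import html

# Hypothetical untrusted input that tries to close the attribute early.
user_text = '" onmouseover="alert(1)'

# Inserted as-is, the first quote ends the value and adds a new attribute.
unsafe = f'<input value="{user_text}">'

# With quotes encoded, the whole string stays inside the attribute value.
safe = f'<input value="{html.escape(user_text, quote=True)}">'

# Encoding only &, <, and > is not enough here: this input has none of them.
unchanged = html.escape(user_text, quote=False)

print(unsafe)  # <input value="" onmouseover="alert(1)">
print(safe)    # <input value="&quot; onmouseover=&quot;alert(1)">
print(unchanged == user_text)  # True
```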

Security Considerations

Cross-site scripting prevention requires context-aware output encoding and often additional sanitization. If a system allows rich HTML from users, simply encoding or decoding entities is not enough to decide what is safe. A trusted HTML sanitizer should remove dangerous elements and attributes. Frameworks such as React escape text by default, but dangerous raw HTML insertion APIs still require careful review.

This tool is meant for developer convenience: preparing examples, decoding snippets, understanding logs, and checking how reserved characters transform. Production security decisions should follow the rules of the target templating system and the specific browser context where the data will be rendered.

Decoding should be handled carefully when data crosses trust boundaries. A string may be encoded more than once, or it may contain a mixture of literal characters and entities. If a server decodes input and later inserts it into HTML without encoding, a previously harmless-looking value can become active markup. Many security bugs come from doing the correct transformation at the wrong stage of a pipeline.
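To make the multiple-encoding case concrete, the sketch below counts how many decoding passes a string needs before it stops changing. The function name decode_layers and the max_rounds limit are assumptions for illustration, and Python's standard html module stands in for whatever decoder a real pipeline uses.

```python
import html
from typing import Tuple


def decode_layers(text: str, max_rounds: int = 5) -> Tuple[str, int]:
    """Decode repeatedly until the text is stable; return text and pass count."""
    rounds = 0
    while rounds < max_rounds:
        decoded = html.unescape(text)
        if decoded == text:
            break
        text = decoded
        rounds += 1
    return text, rounds


original = "<b>"
once = html.escape(original)   # &lt;b&gt;
twice = html.escape(once)      # &amp;lt;b&amp;gt;

print(decode_layers(once))     # ('<b>', 1)
print(decode_layers(twice))    # ('<b>', 2)
```

This is a debugging aid rather than a normalization step. Text that legitimately contains a literal &amp;lt; would be over-decoded by it, and any decoded result still has to be encoded again before it is placed into HTML.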

HTML entities are also different from URL encoding. The text %3C is a URL-encoded less-than sign, while &lt; is an HTML entity. JSON escaping, shell escaping, SQL parameterization, and HTML entity encoding solve different problems. A reliable application chooses the encoding for the output context, not just for the character being represented. That is why a small entity converter can be useful during debugging: it makes the context visible.
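The difference between contexts can be seen by passing one value through three standard-library encoders in Python. The sample value is made up for illustration; the point is only that each context yields a different result for the same characters.

```python
import html
import json
from urllib.parse import quote

value = 'a<b & "c"'

html_form = html.escape(value)     # for HTML text or quoted attributes
url_form = quote(value, safe="")   # for a URL component
json_form = json.dumps(value)      # for a JSON string literal

print(html_form)  # a&lt;b &amp; &quot;c&quot;
print(url_form)   # a%3Cb%20%26%20%22c%22
print(json_form)  # "a<b & \"c\""
```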

Documentation workflows benefit from entity encoding because examples must remain examples. A tutorial that shows a script tag, an SVG snippet, or an XML fragment should encode the markup so readers see the code rather than the browser interpreting it. The same principle applies to comments, changelogs, generated docs, and developer support messages.

Manual Verification Workflow

A manual entity check should encode ampersands first, then angle brackets, then quotes if the text will appear in an attribute. Encoding the ampersand first ensures that the ampersands in entities created by the later steps are not encoded again, which would turn &lt; into &amp;lt;. To decode, resolve named or numeric entities back to characters and then ask where the decoded text will be placed. Decoded text that is safe as plain text may be unsafe inside raw HTML. This context check is why entity tools are useful for debugging but should not be treated as complete security sanitizers.
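The decoding half of this workflow might look like the following sketch. The NAMED table, the ENTITY pattern, and the decode_entities function are hypothetical and cover only the five common named entities plus decimal and hexadecimal numeric entities; a full decoder would need the complete named entity list from the HTML specification.

```python
import html
import re

# Only the common named entities; unknown names are left untouched.
NAMED = {"amp": "&", "lt": "<", "gt": ">", "quot": '"', "apos": "'"}

# Matches &#xHEX; or &#DECIMAL; or &name;
ENTITY = re.compile(r"&(#[xX][0-9a-fA-F]+|#[0-9]+|[a-zA-Z]+);")


def decode_entities(text: str) -> str:
    """Resolve one layer of entities in a single left-to-right pass."""
    def resolve(match):
        body = match.group(1)
        try:
            if body[:2] in ("#x", "#X"):
                return chr(int(body[2:], 16))
            if body.startswith("#"):
                return chr(int(body[1:]))
        except (ValueError, OverflowError):
            return match.group(0)  # code point out of range: leave as-is
        return NAMED.get(body, match.group(0))

    return ENTITY.sub(resolve, text)


sample = '<span data-value="5 & 7">logic</span>'
round_trip = decode_entities(html.escape(sample))
print(round_trip == sample)          # True
print(decode_entities("&amp;lt;"))   # &lt;
```

Because the pass is single and does not rescan its own output, &amp;lt; decodes to the text &lt; rather than to a less-than sign, which keeps one decoding step from silently undoing two layers of encoding.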