MANDOC_HTML(3) Library Functions Manual MANDOC_HTML(3)

NAME
mandoc_html - internals of the mandoc HTML formatter

SYNOPSIS
#include <sys/types.h>

#include "mandoc.h"
#include "roff.h"
#include "out.h"
#include "html.h"

void
print_gen_decls(struct html *h);

void
print_gen_comment(struct html *h, struct roff_node *n);

void
print_gen_head(struct html *h);

struct tag *
print_otag(struct html *h, enum htmltag tag, const char *fmt, ...);

void
print_tagq(struct html *h, const struct tag *until);

void
print_stagq(struct html *h, const struct tag *suntil);

void
html_close_paragraph(struct html *h);

enum roff_tok
html_fillmode(struct html *h, enum roff_tok tok);

int
html_setfont(struct html *h, enum mandoc_esc font);

void
print_text(struct html *h, const char *word);

void
print_tagged_text(struct html *h, const char *word, struct roff_node *n);

char *
html_make_id(const struct roff_node *n, int unique);

struct tag *
print_otag_id(struct html *h, enum htmltag tag, const char *cattr, struct roff_node *n);

void
print_endline(struct html *h);

DESCRIPTION
The mandoc HTML formatter is not a formal library. However, because it is compiled into more than one program, in particular mandoc(1) and man.cgi(8), and because it may be security-critical in some contexts, some documentation is useful to help use it correctly and to prevent XSS vulnerabilities.

The formatter produces HTML output on the standard output. Since proper escaping is usually required and best taken care of at one central place, the language-specific formatters (*_html.c, see FILES) are not supposed to print directly to stdout using functions like printf(3), putc(3), puts(3), or write(2). Instead, they are expected to use the output functions declared in html.h and implemented as part of the main HTML formatting engine in html.c.
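To illustrate why escaping is best handled in one central place, the following is a minimal sketch of an escaping routine. It is not the code from html.c: the name sketch_escape() and its buffer interface are assumptions made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of html.h: copy src into dst,
 * replacing the characters that are special in HTML text and
 * attribute values.  Returns the number of bytes written, not
 * counting the terminating NUL, or (size_t)-1 if dst is too small.
 */
size_t
sketch_escape(char *dst, size_t dstsz, const char *src)
{
	size_t	 len = 0;

	assert(dst != NULL && src != NULL);
	for (; *src != '\0'; src++) {
		const char	*rep;
		char		 one[2];
		size_t		 replen;

		switch (*src) {
		case '<':
			rep = "&lt;";
			break;
		case '>':
			rep = "&gt;";
			break;
		case '&':
			rep = "&amp;";
			break;
		case '"':
			rep = "&quot;";
			break;
		default:
			/* Ordinary byte: copy it unchanged. */
			one[0] = *src;
			one[1] = '\0';
			rep = one;
			break;
		}
		replen = strlen(rep);
		/* Always keep room for the terminating NUL. */
		if (len + replen + 1 > dstsz)
			return (size_t)-1;
		memcpy(dst + len, rep, replen);
		len += replen;
	}
	if (len + 1 > dstsz)
		return (size_t)-1;
	dst[len] = '\0';
	return len;
}
```

A language-specific formatter that called printf(3) directly would have to repeat this logic at every call site, and a single omission could let markup from the input reach the browser unescaped.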

These structures are declared in html.h.

struct html
Internal state of the HTML formatter.
struct tag
One entry for the LIFO stack of HTML elements. Members include enum htmltag tag and struct tag *next.

The function print_gen_decls() prints the opening ⟨!DOCTYPE⟩ declaration.

The fmt argument of print_otag() is a string of format letters, each of which requests one attribute:

c
Print a class attribute.
h
Print a href attribute. This attribute letter can optionally be followed by a modifier letter. If followed by R, it formats the link as a local one by prefixing a ‘#’ character. If followed by I, it interprets the argument as a header file name and generates a link using the mandoc(1) -O includes option. If followed by M, it takes two arguments instead of one, a manual page name and section, and formats them as a link to a manual page using the mandoc(1) -O man option.
i
Print an id attribute.
?
Print an arbitrary attribute. This format letter requires two char * arguments, the attribute name and the value. The name must not be NULL.
s
Print a style attribute. If present, it must be the last format letter. It requires two char * arguments. The first is the name of the style property, the second its value. The name must not be NULL. The s fmt letter can be repeated, each repetition requiring an additional pair of char * arguments.
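The way a format string drives the variable argument list can be sketched with a small stand-alone model. This is not the html.c implementation: the names put() and sketch_attrs() are hypothetical, the output goes to a caller-supplied buffer instead of stdout, no escaping is done, and only the letters c, i, h (with the R modifier), ?, and s are modeled. Skipping an attribute whose value is NULL is an assumption of this sketch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Append formatted text at buf[*len]; return 0 on success, -1 if full. */
static int
put(char *buf, size_t bufsz, size_t *len, const char *fmt, ...)
{
	va_list	 ap;
	int	 n;

	va_start(ap, fmt);
	n = vsnprintf(buf + *len, bufsz - *len, fmt, ap);
	va_end(ap);
	if (n < 0 || (size_t)n >= bufsz - *len)
		return -1;
	*len += (size_t)n;
	return 0;
}

/*
 * Simplified model of the attribute letters (hypothetical function).
 * Returns the length of the attribute string, or -1 on error.
 */
int
sketch_attrs(char *buf, size_t bufsz, const char *fmt, ...)
{
	va_list		 ap;
	const char	*name = NULL, *val = NULL, *prefix;
	size_t		 len = 0;
	int		 in_style = 0, rc = 0;

	assert(buf != NULL && bufsz > 0 && fmt != NULL);
	buf[0] = '\0';
	va_start(ap, fmt);
	while (rc == 0 && *fmt != '\0') {
		prefix = "";
		switch (*fmt++) {
		case 'c':
			name = "class";
			break;
		case 'i':
			name = "id";
			break;
		case 'h':
			name = "href";
			if (*fmt == 'R') {	/* local link */
				prefix = "#";
				fmt++;
			}
			break;
		case '?':	/* attribute name comes from the caller */
			name = va_arg(ap, const char *);
			assert(name != NULL);
			break;
		case 's':	/* one property/value pair per letter */
			name = va_arg(ap, const char *);
			val = va_arg(ap, const char *);
			assert(name != NULL);
			if (val != NULL) {
				rc = put(buf, bufsz, &len, in_style ?
				    " %s: %s;" : " style=\"%s: %s;",
				    name, val);
				in_style = 1;
			}
			continue;
		default:	/* letter not modeled by this sketch */
			rc = -1;
			continue;
		}
		val = va_arg(ap, const char *);
		if (val != NULL)	/* assumption: NULL skips it */
			rc = put(buf, bufsz, &len, " %s=\"%s%s\"",
			    name, prefix, val);
	}
	va_end(ap);
	if (rc == 0 && in_style)	/* close the style attribute */
		rc = put(buf, bufsz, &len, "\"");
	return rc == 0 ? (int)strlen(buf) : -1;
}
```

In this model, the call sketch_attrs(buf, sizeof(buf), "chR", "Sx", "NAME") consumes one argument per letter and treats R as a modifier of the preceding h rather than as a letter of its own. The same pairing of letters and arguments applies to a real print_otag() call.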

The function html_fillmode() switches to fill mode if tok is ROFF_fi or to no-fill mode if tok is ROFF_nf. Switching from fill mode to no-fill mode closes the current paragraph and opens a ⟨PRE⟩ element. Switching in the opposite direction closes the ⟨PRE⟩ element, but does not open a new paragraph. If tok matches the mode that is already active, no elements are closed or opened. If tok is TOKEN_NONE, the mode remains as it is.
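The mode transitions can be summarized as a small state machine. The following is a sketch only: enum sk_tok, struct sk_state, and sk_fillmode() are hypothetical stand-ins for the real types, and the counters merely record which elements the real formatter would close or open.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for TOKEN_NONE, ROFF_fi, and ROFF_nf. */
enum sk_tok { SK_NONE, SK_FI, SK_NF };

struct sk_state {
	int	 nofill;	/* non-zero while in no-fill mode */
	int	 par_closed;	/* paragraphs closed so far */
	int	 pre_opened;	/* PRE elements opened so far */
	int	 pre_closed;	/* PRE elements closed so far */
};

/*
 * Model of the mode switch: returns the mode that was active
 * before the call.
 */
enum sk_tok
sk_fillmode(struct sk_state *st, enum sk_tok tok)
{
	enum sk_tok	 had;

	assert(st != NULL);
	had = st->nofill ? SK_NF : SK_FI;

	/* Nothing to do for a query or for the mode already active. */
	if (tok == SK_NONE || tok == had)
		return had;

	if (tok == SK_NF) {
		/* fill -> no-fill: close paragraph, open PRE */
		st->par_closed++;
		st->pre_opened++;
		st->nofill = 1;
	} else {
		/* no-fill -> fill: close PRE, open nothing */
		st->pre_closed++;
		st->nofill = 0;
	}
	return had;
}
```

Because the previous mode is returned, a caller can switch modes temporarily and restore the old mode afterwards by passing the saved return value to a second call.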

The function html_setfont() selects the font, which can be ESCAPE_FONTROMAN, ESCAPE_FONTBOLD, ESCAPE_FONTITALIC, ESCAPE_FONTBI, or ESCAPE_FONTCW, for future text output and internally remembers the font that was active before the change. If the font argument is ESCAPE_FONTPREV, the current and the previous font are exchanged. This function only changes the internal state of the h object; no HTML elements are written yet. Subsequent text output will write font elements when needed.
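The bookkeeping of current and previous font can be sketched as follows. The types and the function sk_setfont() are hypothetical stand-ins. The sketch models only the state change described above, not the return value of html_setfont() and not the later output of font elements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the ESCAPE_FONT* constants. */
enum sk_font { SK_ROMAN, SK_BOLD, SK_ITALIC, SK_BI, SK_CW, SK_PREV };

struct sk_fonts {
	enum sk_font	 cur;	/* font for future text output */
	enum sk_font	 prev;	/* font active before the last change */
};

/* Select a font, or exchange current and previous for SK_PREV. */
void
sk_setfont(struct sk_fonts *f, enum sk_font font)
{
	enum sk_font	 old;

	assert(f != NULL);
	old = f->cur;
	f->cur = (font == SK_PREV) ? f->prev : font;
	f->prev = old;
}
```

Starting from roman, selecting bold and then requesting the previous font twice yields roman and then bold again, because each request for the previous font exchanges the two remembered fonts.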

The function html_make_id() allocates a string to be used for the id attribute of an HTML element and/or as a segment identifier for a URI in an ⟨A⟩ element. If n contains a tag attribute, it is used; otherwise, child nodes are used. If n is an Sh, Ss, Sx, SH, or SS node, the resulting string is the concatenation of the child strings; for other node types, only the first child is used. Bytes not permitted in URI-fragment strings are replaced by underscores. If any of the children to be used is not a text node, no string is generated and NULL is returned instead. If the unique argument is non-zero, deduplication is performed by appending an underscore and a decimal integer, if necessary. If the unique argument is 1, this is assumed to be the first call for this tag at this location, typically for use by NODE_ID, so the integer is incremented before use. If the unique argument is 2, this is assumed to be the second call for this tag at this location, typically for use by NODE_HREF, so the existing integer, if any, is used without incrementing it.
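The byte replacement and the deduplication suffix can be illustrated with a stand-alone sketch. The function sk_make_id() is hypothetical and works on a plain string rather than on a struct roff_node. The set of permitted bytes (alphanumerics plus the punctuation listed in the code) is an assumption based on the URI fragment syntax, the counter is passed in by the caller instead of being tracked internally, and the sketch returns NULL on allocation failure instead of calling err(3).

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: build an id from text, replacing bytes
 * outside an assumed set of fragment-safe characters by
 * underscores, and append "_<count>" if count is positive.
 * Returns a malloc(3)ed string that the caller must free(3).
 */
char *
sk_make_id(const char *text, int count)
{
	static const char safe[] = "!$&'()*+,-./:;=?@_~";
	size_t	 sz;
	char	*id, *cp;

	assert(text != NULL && count >= 0);

	/* Room for the text, '_', a decimal int, and the NUL. */
	sz = strlen(text) + 16;
	if ((id = malloc(sz)) == NULL)
		return NULL;
	if (count > 0)
		snprintf(id, sz, "%s_%d", text, count);
	else
		snprintf(id, sz, "%s", text);

	/* Replace every byte that is not considered fragment-safe. */
	for (cp = id; *cp != '\0'; cp++)
		if (!isalnum((unsigned char)*cp) &&
		    strchr(safe, *cp) == NULL)
			*cp = '_';
	return id;
}
```

With these assumptions, the text "RETURN VALUES" becomes "RETURN_VALUES", and a second occurrence at another location could be given the count 2 to obtain "RETURN_VALUES_2".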

RETURN VALUES
The functions print_otag() and print_otag_id() return a pointer to a new element on the stack of HTML elements. When print_otag_id() opens two elements, a pointer to the outer one is returned. The memory pointed to is owned by the library and is automatically free(3)d when print_tagq() is called on it or when print_stagq() is called on a parent element.
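The ownership rule follows from the LIFO stack of open elements. The sketch below models that stack with hypothetical names (struct sk_tag, sk_open(), sk_close_through(), sk_close_above()). It assumes that print_tagq() closes elements down to and including its argument, while print_stagq() closes only the elements above its argument. It calls abort(3) rather than err(3) on allocation failure and prints nothing.

```c
#include <assert.h>
#include <stdlib.h>

struct sk_tag {
	int		 id;	/* stands in for enum htmltag */
	struct sk_tag	*next;	/* element below on the stack */
};

/* Push a new element and return a pointer to it. */
struct sk_tag *
sk_open(struct sk_tag **top, int id)
{
	struct sk_tag	*t;

	assert(top != NULL);
	if ((t = malloc(sizeof(*t))) == NULL)
		abort();
	t->id = id;
	t->next = *top;
	*top = t;
	return t;
}

/* Pop and free the top element. */
static void
sk_pop(struct sk_tag **top)
{
	struct sk_tag	*t = *top;

	*top = t->next;
	free(t);
}

/*
 * Modeled on print_tagq(): close elements down to and including
 * until.  Returns the number of elements closed.
 */
int
sk_close_through(struct sk_tag **top, const struct sk_tag *until)
{
	int	 n = 0, done = 0;

	while (*top != NULL && !done) {
		done = (*top == until);	/* compare before freeing */
		sk_pop(top);
		n++;
	}
	return n;
}

/*
 * Modeled on print_stagq(): close everything above suntil, but
 * leave suntil itself open.  Returns the number closed.
 */
int
sk_close_above(struct sk_tag **top, const struct sk_tag *suntil)
{
	int	 n = 0;

	while (*top != NULL && *top != suntil) {
		sk_pop(top);
		n++;
	}
	return n;
}
```

In this model, a pointer returned by sk_open() becomes invalid as soon as the element or any element below it is closed through, which is why a caller must not free such pointers itself or use them after closing.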

The function html_fillmode() returns ROFF_fi if fill mode was active before the call or ROFF_nf otherwise.

The function html_make_id() returns a newly allocated string or NULL if n lacks text data to create the attribute from. The caller is responsible for free(3)ing the returned string after using it.

In case of malloc(3) failure, these functions do not return but call err(3).

FILES
main.h
declarations of public functions for use by the main program, not yet documented
html.h
declarations of data types and private functions for use by language-specific HTML formatters
html.c
main HTML formatting engine and utility functions
mdoc_html.c
mdoc(7) HTML formatter
man_html.c
man(7) HTML formatter
tbl_html.c
tbl(7) HTML formatter
eqn_html.c
eqn(7) HTML formatter
roff_html.c
roff(7) HTML formatter, handling requests like br, ce, fi, ft, nf, rj, and sp.
out.h
declarations of data types and private functions for shared use by all mandoc formatters, not yet documented
out.c
private functions for shared use by all mandoc formatters
mandoc_aux.h
declarations of common mandoc utility functions, see mandoc(3)
mandoc_aux.c
implementation of common mandoc utility functions

SEE ALSO
mandoc(1), mandoc(3), man.cgi(8)

AUTHORS
The mandoc HTML formatter was written by Kristaps Dzonsons <kristaps@bsd.lv>. It is maintained by Ingo Schwarze <schwarze@openbsd.org>, who also wrote this manual.

April 24, 2020 OpenBSD 6.7