| Class | SM::SimpleMarkup |
| In: | markup/simple_markup.rb |
| Parent: | Object |
This code converts input_string, which is in the format described in markup/simple_markup.rb, to HTML. The conversion takes place in the convert method, so you can use the same SimpleMarkup object to convert multiple input strings.
require 'rdoc/markup/simple_markup'
require 'rdoc/markup/simple_markup/to_html'

p = SM::SimpleMarkup.new
h = SM::ToHtml.new

puts p.convert(input_string, h)

You can extend the SimpleMarkup parser to recognise new markup sequences, and to add special processing for text that matches a regular expression. Here we make WikiWords significant to the parser, and also make the sequences {word} and <no>text...</no> signify strike-through text. We then subclass the HTML output class to deal with these:
require 'rdoc/markup/simple_markup'
require 'rdoc/markup/simple_markup/to_html'
class WikiHtml < SM::ToHtml
  def handle_special_WIKIWORD(special)
    "<font color=red>" + special.text + "</font>"
  end
end
p = SM::SimpleMarkup.new
p.add_word_pair("{", "}", :STRIKE)
p.add_html("no", :STRIKE)
p.add_special(/\b([A-Z][a-z]+[A-Z]\w+)/, :WIKIWORD)
h = WikiHtml.new
h.add_tag(:STRIKE, "<strike>", "</strike>")
puts "<body>" + p.convert(ARGF.read, h) + "</body>"
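To see which words the WIKIWORD pattern in the add_special call above would pick out, the regular expression can be tried on its own, without the parser. The following is a minimal sketch in plain Ruby; the helper names wiki_words and highlight_wiki_words are hypothetical, and the second one only imitates what handle_special_WIKIWORD returns for each match.

```ruby
# Pattern copied from the add_special call above.
WIKIWORD_RE = /\b([A-Z][a-z]+[A-Z]\w+)/

# Hypothetical helper: list every WikiWord found in a string.
def wiki_words(text)
  text.scan(WIKIWORD_RE).flatten
end

# Hypothetical helper: wrap each match the way handle_special_WIKIWORD does.
def highlight_wiki_words(text)
  text.gsub(WIKIWORD_RE) { "<font color=red>" + $1 + "</font>" }
end

puts wiki_words("See SimpleMarkup and RDoc or WikiWord").inspect
puts highlight_wiki_words("Edit the FrontPage first")
```

A word such as RDoc is skipped because the pattern requires at least one lower-case letter between the two capitals.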
| SPACE | = | ?\s | |
| SIMPLE_LIST_RE | = | /^( ( \* (?# bullet) |- (?# bullet) |\d+\. (?# numbered ) |[A-Za-z]\. (?# alphabetically numbered ) ) \s+ )\S/x | List entries look like "* text", "1. text", "[label] text", or "label:: text". Flag the line as a list entry, and work out the indent for subsequent lines. |
| LABEL_LIST_RE | = | /^( ( \[.*?\] (?# labeled ) |\S.*:: (?# note ) )(?:\s+|$) )/x | |
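The two list patterns can be exercised directly to see what each capture group holds. This is a small sketch in plain Ruby: the patterns are copied from the constants above and laid out over several lines (the x flag ignores the whitespace), while the helper list_prefix is a hypothetical name introduced only for this illustration.

```ruby
# Both patterns are copied from the constants above.
SIMPLE_LIST_RE = /^(
  (  \*          (?# bullet)
    |-           (?# bullet)
    |\d+\.       (?# numbered )
    |[A-Za-z]\.  (?# alphabetically numbered )
  )
  \s+
)\S/x

LABEL_LIST_RE = /^(
  (  \[.*?\]    (?# labeled )
    |\S.*::     (?# note )
  )(?:\s+|$)
)/x

# Hypothetical helper: return [prefix, indent] for a list entry, or nil.
# prefix is the second capture group; indent is the length of the first,
# which is what the parser adds to the margin for subsequent lines.
def list_prefix(line)
  m = SIMPLE_LIST_RE.match(line) || LABEL_LIST_RE.match(line)
  m && [m[2], m[1].length]
end

["* text", "1. text", "b. text", "[label] text", "label:: text", "plain text"].each do |line|
  puts "#{line.inspect} => #{list_prefix(line).inspect}"
end
```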
Take a block of text and use various heuristics to determine its structure (paragraphs, lists, and so on). Invoke an event handler as we identify significant chunks.
# File markup/simple_markup.rb, line 207
def initialize
  @am = AttributeManager.new
  @output = nil
  @block_exceptions = nil
end
Add to the sequences recognized as general markup
# File markup/simple_markup.rb, line 226
def add_html(tag, name)
  @am.add_html(tag, name)
end
Add to other inline sequences. For example, we could add WikiWords using something like:
parser.add_special(/\b([A-Z][a-z]+[A-Z]\w+)/, :WIKIWORD)
Each wiki word will be presented to the output formatter via the accept_special method
# File markup/simple_markup.rb, line 240
def add_special(pattern, name)
  @am.add_special(pattern, name)
end
Add to the sequences used to add formatting to an individual word (such as bold). Matching entries will generate attributes that the output formatters can recognize by their name.
# File markup/simple_markup.rb, line 218
def add_word_pair(start, stop, name)
  @am.add_word_pair(start, stop, name)
end
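The effect of a word pair can be pictured as a plain substitution. This is only an illustrative sketch: the helper apply_word_pair is hypothetical, and in the library the matching is done by the AttributeManager and the tags are supplied by the output formatter, so the real matching rules may differ.

```ruby
# Hypothetical helper: wrap any single word enclosed by start/stop
# in the given opening and closing tags.
def apply_word_pair(text, start, stop, open_tag, close_tag)
  pattern = /#{Regexp.escape(start)}(\S+?)#{Regexp.escape(stop)}/
  text.gsub(pattern) { open_tag + $1 + close_tag }
end

puts apply_word_pair("this is {wrong} text", "{", "}", "<strike>", "</strike>")
```

Regexp.escape is used so that delimiters with a special meaning in a regular expression, such as { or *, are matched literally.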
Look through the text at line indentation. We flag each line as being blank, a rule, a heading, a list element, verbatim text, or a paragraph.
# File markup/simple_markup.rb, line 274
def assign_types_to_lines(margin = 0, level = 0)
  now_blocking = false
  while line = @lines.next

    if line.isBlank?
      line.stamp(Line::BLANK, level)
      next
    end

    # if a line contains non-blanks before the margin, then it must belong
    # to an outer level

    text = line.text

    for i in 0...margin
      if text[i] != SPACE
        @lines.unget
        return
      end
    end

    active_line = text[margin..-1]

    #
    # block_exceptions checking
    #
    if @block_exceptions
      if now_blocking
        line.stamp(Line::PARAGRAPH, level)
        @block_exceptions.each{ |be|
          if now_blocking == be['name']
            be['replaces'].each{ |rep|
              line.text.gsub!(rep['from'], rep['to'])
            }
          end
          if now_blocking == be['name'] && line.text =~ be['end']
            now_blocking = false
            break
          end
        }
        next
      else
        @block_exceptions.each{ |be|
          if line.text =~ be['start']
            now_blocking = be['name']
            line.stamp(Line::PARAGRAPH, level)
            break
          end
        }
        next if now_blocking
      end
    end


    # Rules (horizontal lines) look like
    #
    #  --- (three or more hyphens)
    #
    # The more hyphens, the thicker the rule
    #

    if /^(---+)\s*$/ =~ active_line
      line.stamp(Line::RULE, level, $1.length-2)
      next
    end

    # Then look for list entries. First the ones that have to have
    # text following them (* xxx, - xxx, and dd. xxx)

    if SIMPLE_LIST_RE =~ active_line

      offset = margin + $1.length
      prefix = $2
      prefix_length = prefix.length

      flag = case prefix
             when "*","-" then ListBase::BULLET
             when /^\d/ then ListBase::NUMBER
             when /^[A-Z]/ then ListBase::UPPERALPHA
             when /^[a-z]/ then ListBase::LOWERALPHA
             else raise "Invalid List Type: #{self.inspect}"
             end

      line.stamp(Line::LIST, level+1, prefix, flag)
      text[margin, prefix_length] = " " * prefix_length
      assign_types_to_lines(offset, level + 1)
      next
    end


    if LABEL_LIST_RE =~ active_line
      offset = margin + $1.length
      prefix = $2
      prefix_length = prefix.length

      next if handled_labeled_list(line, level, margin, offset, prefix)
    end

    # Headings look like
    # = Main heading
    # == Second level
    # === Third
    #
    # Headings reset the level to 0

    if active_line[0] == ?= and active_line =~ /^(=+)\s*(.*)/
      prefix_length = $1.length
      prefix_length = 6 if prefix_length > 6
      line.stamp(Line::HEADING, 0, prefix_length)
      line.strip_leading(margin + prefix_length)
      next
    end

    # If the character's a space, then we have verbatim text,
    # otherwise

    if active_line[0] == SPACE
      line.strip_leading(margin) if margin > 0
      line.stamp(Line::VERBATIM, level)
    else
      line.stamp(Line::PARAGRAPH, level)
    end
  end
end
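The order of the checks in the method above matters: blank lines first, then rules, list entries, headings, and finally verbatim text or a plain paragraph. The sketch below restates that order as a standalone classifier. It is a simplification under stated assumptions: the margin is always 0, there is no nesting and no block_exceptions handling, a blank line is taken to mean whitespace only (the behaviour of isBlank? is not shown in this listing), and the names classify_line, SKETCH_SIMPLE_LIST and SKETCH_LABEL_LIST are hypothetical.

```ruby
# Compact copies of the two list patterns, without the inline comments.
SKETCH_SIMPLE_LIST = /^((\*|-|\d+\.|[A-Za-z]\.)\s+)\S/
SKETCH_LABEL_LIST  = /^((\[.*?\]|\S.*::)(?:\s+|$))/

# Hypothetical helper: return [type, param] for one line of input,
# applying the checks in the same order as assign_types_to_lines.
def classify_line(line)
  case line
  when /^\s*$/            then [:blank, nil]                   # assumed meaning of isBlank?
  when /^(---+)\s*$/      then [:rule, $1.length - 2]          # more hyphens, thicker rule
  when SKETCH_SIMPLE_LIST then [:list, $2]                     # "*", "-", "1.", "a."
  when SKETCH_LABEL_LIST  then [:list, $2]                     # "[label]" or "label::"
  when /^(=+)\s*(.*)/     then [:heading, [$1.length, 6].min]  # level capped at 6
  when /^ /               then [:verbatim, nil]                # leading space
  else                         [:paragraph, nil]
  end
end

["", "-----", "* item", "== Second level", "  code", "plain text"].each do |line|
  puts "#{line.inspect} => #{classify_line(line).inspect}"
end
```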
For debugging, we allow access to our line contents as text.
# File markup/simple_markup.rb, line 498
def content
  @lines.as_text
end
We take a string, split it into lines, work out the type of each line, and from there deduce groups of lines (for example all lines in a paragraph). We then invoke the output formatter using a Visitor to display the result
# File markup/simple_markup.rb, line 250
def convert(str, op, block_exceptions=nil)
  @lines = Lines.new(str.split(/\r?\n/).collect { |aLine|
                       Line.new(aLine) })
  return "" if @lines.empty?
  @lines.normalize
  @block_exceptions = block_exceptions
  assign_types_to_lines
  group = group_lines
  # call the output formatter to handle the result
  #      group.to_a.each {|i| p i}
  group.accept(@am, op)
end
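The first step of convert, splitting the input into lines, can be tried in isolation. This short sketch uses the same regular expression as the listing above; the helper name split_lines is hypothetical. It shows that both Unix and Windows line endings are accepted, and that empty input produces no lines at all, which is the case the early return of "" guards against.

```ruby
# Hypothetical helper: split input the same way convert does.
def split_lines(str)
  str.split(/\r?\n/)
end

p split_lines("one\r\ntwo\nthree")  # mixed line endings
p split_lines("a\n\nb")             # an interior blank line is kept
p split_lines("")                   # no lines at all
p split_lines("\n\n")               # trailing empty strings are dropped
```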
For debugging, return the list of line types.
# File markup/simple_markup.rb, line 504
def get_line_types
  @lines.line_types
end
Return a block consisting of fragments which are paragraphs, list entries or verbatim text. We merge consecutive lines of the same type and level together. We are also slightly tricky with lists: the lines following a list introduction look like paragraph lines at the next level, and we remap them into list entries instead
# File markup/simple_markup.rb, line 469
def group_lines
  @lines.rewind

  inList = false
  wantedType = wantedLevel = nil

  block = LineCollection.new
  group = nil

  while line = @lines.next
    if line.level == wantedLevel and line.type == wantedType
      group.add_text(line.text)
    else
      group = block.fragment_for(line)
      block.add(group)
      if line.type == Line::LIST
        wantedType = Line::PARAGRAPH
      else
        wantedType = line.type
      end
      wantedLevel = line.type == Line::HEADING ? line.param : line.level
    end
  end

  block.normalize
  block
end
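The merging rule, including the list remapping, can be sketched with plain structs. This is an illustration under stated assumptions rather than the library's implementation: SketchLine, SketchFragment and group_lines_sketch are hypothetical names, line types are symbols instead of Line constants, and the special handling of heading levels and the final normalize step are left out.

```ruby
SketchLine     = Struct.new(:type, :level, :text)
SketchFragment = Struct.new(:type, :level, :text)

# Hypothetical helper: merge consecutive lines of the same type and level.
# After a :list line we expect :paragraph lines at the same level and fold
# them into the list entry instead of starting a new fragment.
def group_lines_sketch(lines)
  fragments = []
  wanted_type = wanted_level = nil

  lines.each do |line|
    if !fragments.empty? && line.level == wanted_level && line.type == wanted_type
      fragments.last.text += "\n" + line.text
    else
      fragments << SketchFragment.new(line.type, line.level, line.text)
      wanted_type  = line.type == :list ? :paragraph : line.type
      wanted_level = line.level
    end
  end

  fragments
end

input = [
  SketchLine.new(:list,      1, "first entry"),
  SketchLine.new(:paragraph, 1, "continued"),
  SketchLine.new(:paragraph, 0, "a paragraph"),
  SketchLine.new(:paragraph, 0, "more of it"),
]
group_lines_sketch(input).each { |f| p f.to_a }
```

The four input lines collapse into two fragments: one list entry that has absorbed its continuation line, and one paragraph.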
Handle labeled list entries. We have a special case to deal with: because the labels can be long, they force the remaining block of text over to the right:
| this is a long label that I wrote: | and here is the block of text with a silly margin |
So we allow the special case. If the label is followed by nothing, and if the following line is indented, then we take the indent of that line as the new margin
| this is a long label that I wrote: | here is a more reasonably indented block which will be attached to the label. |
# File markup/simple_markup.rb, line 416
def handled_labeled_list(line, level, margin, offset, prefix)
  prefix_length = prefix.length
  text = line.text
  flag = nil
  case prefix
  when /^\[/
    flag = ListBase::LABELED
    prefix = prefix[1, prefix.length-2]
  when /:$/
    flag = ListBase::NOTE
    prefix.chop!
  else raise "Invalid List Type: #{self.inspect}"
  end

  # body is on the next line

  if text.length <= offset
    original_line = line
    line = @lines.next
    return(false) unless line
    text = line.text

    for i in 0..margin
      if text[i] != SPACE
        @lines.unget
        return false
      end
    end
    i = margin
    i += 1 while text[i] == SPACE
    if i >= text.length
      @lines.unget
      return false
    else
      offset = i
      prefix_length = 0
      @lines.delete(original_line)
    end
  end

  line.stamp(Line::LIST, level+1, prefix, flag)
  text[margin, prefix_length] = " " * prefix_length
  assign_types_to_lines(offset, level + 1)
  return true
end
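The case statement at the top of the method turns the matched prefix into a list type and a label. The sketch below isolates that step; parse_label is a hypothetical name, the symbols stand in for the ListBase constants, and a non-destructive chop is used so the argument is left unchanged.

```ruby
# Hypothetical helper mirroring the case statement in handled_labeled_list.
def parse_label(prefix)
  case prefix
  when /^\[/ then [:labeled, prefix[1, prefix.length - 2]]  # drop the brackets
  when /:$/  then [:note, prefix.chop]                      # drop one trailing colon
  else raise ArgumentError, "Invalid List Type: #{prefix.inspect}"
  end
end

p parse_label("[cat]")
p parse_label("cat::")
```

Only one of the two colons is removed from a note label, which is consistent with the trailing colon shown on the label in the example above.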