| Class | SM::SimpleMarkup |
| In: | markup/simple_markup.rb |
| Parent: | Object |
This code converts input_string, which is in the format described in markup/simple_markup.rb, to HTML. The conversion takes place in the convert method, so you can use the same SimpleMarkup object to convert multiple input strings.
require 'rdoc/markup/simple_markup'
require 'rdoc/markup/simple_markup/to_html'

p = SM::SimpleMarkup.new
h = SM::ToHtml.new

puts p.convert(input_string, h)
You can extend the SimpleMarkup parser to recognise new markup sequences, and to add special processing for text that matches a regular expression. Here we make WikiWords significant to the parser, and also make the sequences {word} and <no>text...</no> signify strike-through text. We then subclass the HTML output class to deal with these:
require 'rdoc/markup/simple_markup'
require 'rdoc/markup/simple_markup/to_html'
class WikiHtml < SM::ToHtml
  def handle_special_WIKIWORD(special)
    "<font color=red>" + special.text + "</font>"
  end
end
p = SM::SimpleMarkup.new
p.add_word_pair("{", "}", :STRIKE)
p.add_html("no", :STRIKE)
p.add_special(/\b([A-Z][a-z]+[A-Z]\w+)/, :WIKIWORD)
h = WikiHtml.new
h.add_tag(:STRIKE, "<strike>", "</strike>")
puts "<body>" + p.convert(ARGF.read, h) + "</body>"
| SPACE | = | ?\s | ||
| SIMPLE_LIST_RE | = | /^( ( \* (?# bullet) |- (?# bullet) |\d+\. (?# numbered ) |[A-Za-z]\. (?# alphabetically numbered ) ) \s+ )\S/x |
List entries look like:
  *       text
  1.      text
  [label] text
  label:: text
Flag it as a list entry, and work out the indent for subsequent lines.
| LABEL_LIST_RE | = | /^( ( \[.*?\] (?# labeled ) |\S.*:: (?# note ) )(?:\s+|$) )/x |
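As a rough illustration of what the two list patterns capture, the sketch below copies them from the constants above and applies them to a few sample lines. The helper `list_prefix` is hypothetical and not part of SimpleMarkup. Group 2 is the list prefix itself, and the length of group 1 (prefix plus trailing whitespace) is what the parser adds to the margin to get the indent for subsequent lines.

```ruby
# Patterns copied from the constants listed above.
SIMPLE_LIST_RE = /^(
  (  \*          (?# bullet)
    |-           (?# bullet)
    |\d+\.       (?# numbered )
    |[A-Za-z]\.  (?# alphabetically numbered )
  )
  \s+
)\S/x

LABEL_LIST_RE = /^(
  (  \[.*?\]   (?# labeled )
    |\S.*::    (?# note )
  )(?:\s+|$)
)/x

# Hypothetical helper (not part of SimpleMarkup): returns
# [prefix, indent_width] for a list entry, or nil for any other line.
def list_prefix(line)
  m = SIMPLE_LIST_RE.match(line) || LABEL_LIST_RE.match(line)
  m && [m[2], m[1].length]
end

["* text", "1. text", "[label] text", "label:: text", "plain text"].each do |line|
  p [line, list_prefix(line)]
end
```

Note that a simple list prefix must be followed by whitespace and then some text, whereas a label may stand alone at the end of a line.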
Take a block of text and use various heuristics to determine its structure (paragraphs, lists, and so on). Invoke an event handler as we identify significant chunks.
# File markup/simple_markup.rb, line 207
def initialize
  @am = AttributeManager.new
  @output = nil
  @block_exceptions = nil
end
Add to the sequences recognized as general markup. For example, add_html("no", :STRIKE) makes <no>text...</no> generate the :STRIKE attribute.
# File markup/simple_markup.rb, line 226
def add_html(tag, name)
  @am.add_html(tag, name)
end
Add to other inline sequences. For example, we could add WikiWords using something like:
parser.add_special(/\b([A-Z][a-z]+[A-Z]\w+)/, :WIKIWORD)
Each wiki word will be presented to the output formatter via the accept_special method
# File markup/simple_markup.rb, line 240
def add_special(pattern, name)
  @am.add_special(pattern, name)
end
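To see which words the WikiWord pattern picks out, here is a minimal standalone sketch. It does not go through the parser or the formatter dispatch at all; `highlight_wiki_words` is a hypothetical helper that simply applies the same pattern with gsub and wraps each match in the markup used by the WikiHtml example earlier on this page.

```ruby
# Pattern taken from the add_special example above.
WIKIWORD_RE = /\b([A-Z][a-z]+[A-Z]\w+)/

# Hypothetical stand-in for a formatter's special handler: wraps each
# match in the same markup that WikiHtml#handle_special_WIKIWORD returns.
def highlight_wiki_words(text)
  text.gsub(WIKIWORD_RE) { "<font color=red>#{$1}</font>" }
end

puts highlight_wiki_words("See WikiWord, not HTML or Plain.")
```

The pattern needs a capital, at least one lower-case letter, another capital, and at least one further word character, so all-caps words and ordinary capitalised words are left alone.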
Add to the sequences used to add formatting to an individual word (such as bold). Matching entries will generate attributes that the output formatters can recognize by their name.
# File markup/simple_markup.rb, line 218
def add_word_pair(start, stop, name)
  @am.add_word_pair(start, stop, name)
end
Look through the text at line indentation. We flag each line as being blank, a rule, a heading, a list element, verbatim text, or a paragraph.
# File markup/simple_markup.rb, line 274
def assign_types_to_lines(margin = 0, level = 0)
  now_blocking = false
  while line = @lines.next
    if @block_exceptions
      if now_blocking
        line.stamp(Line::PARAGRAPH, level)
        @block_exceptions.each{ |be|
          if now_blocking == be['name']
            be['replaces'].each{ |rep|
              line.text.gsub!(rep['from'], rep['to'])
            }
          end
          if now_blocking == be['name'] && line.text =~ be['end']
            now_blocking = false
            break
          end
        }
        next
      else
        @block_exceptions.each{ |be|
          if line.text =~ be['start']
            now_blocking = be['name']
            line.stamp(Line::PARAGRAPH, level)
            break
          end
        }
        next if now_blocking
      end
    end

    if line.isBlank?
      line.stamp(Line::BLANK, level)
      next
    end

    # if a line contains non-blanks before the margin, then it must belong
    # to an outer level

    text = line.text

    for i in 0...margin
      if text[i] != SPACE
        @lines.unget
        return
      end
    end

    active_line = text[margin..-1]

    # Rules (horizontal lines) look like
    #
    #  ---   (three or more hyphens)
    #
    # The more hyphens, the thicker the rule
    #

    if /^(---+)\s*$/ =~ active_line
      line.stamp(Line::RULE, level, $1.length-2)
      next
    end

    # Then look for list entries. First the ones that have to have
    # text following them (* xxx, - xxx, and dd. xxx)

    if SIMPLE_LIST_RE =~ active_line

      offset = margin + $1.length
      prefix = $2
      prefix_length = prefix.length

      flag = case prefix
             when "*","-" then ListBase::BULLET
             when /^\d/ then ListBase::NUMBER
             when /^[A-Z]/ then ListBase::UPPERALPHA
             when /^[a-z]/ then ListBase::LOWERALPHA
             else raise "Invalid List Type: #{self.inspect}"
             end

      line.stamp(Line::LIST, level+1, prefix, flag)
      text[margin, prefix_length] = " " * prefix_length
      assign_types_to_lines(offset, level + 1)
      next
    end


    if LABEL_LIST_RE =~ active_line
      offset = margin + $1.length
      prefix = $2
      prefix_length = prefix.length

      next if handled_labeled_list(line, level, margin, offset, prefix)
    end

    # Headings look like
    # = Main heading
    # == Second level
    # === Third
    #
    # Headings reset the level to 0

    if active_line[0] == ?= and active_line =~ /^(=+)\s*(.*)/
      prefix_length = $1.length
      prefix_length = 6 if prefix_length > 6
      line.stamp(Line::HEADING, 0, prefix_length)
      line.strip_leading(margin + prefix_length)
      next
    end

    # If the character's a space, then we have verbatim text,
    # otherwise

    if active_line[0] == SPACE
      line.strip_leading(margin) if margin > 0
      line.stamp(Line::VERBATIM, level)
    else
      line.stamp(Line::PARAGRAPH, level)
    end
  end
end
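The order of the checks in the listing above matters, so a much-reduced sketch may help. The hypothetical `classify` function below works on a single string and applies the same tests in the same order. It is only an illustration: it ignores margins, nested levels, labeled lists and block exceptions, and it returns symbols rather than stamping Line objects.

```ruby
# Simplified, non-recursive sketch of the per-line classification.
# Hypothetical helper; not part of SimpleMarkup.
def classify(line)
  case line
  when /^\s*$/
    [:BLANK]
  when /^(---+)\s*$/
    [:RULE, $1.length - 2]              # more hyphens, thicker rule
  when /^(?:\*|-|\d+\.|[A-Za-z]\.)\s+\S/
    [:LIST]                             # bullet, numbered or lettered entry
  when /^(=+)\s*(.*)/
    [:HEADING, [$1.length, 6].min]      # heading depth is capped at 6
  when /^ /
    [:VERBATIM]                         # leading space means verbatim text
  else
    [:PARAGRAPH]
  end
end

["= Title", "", "Some text", "* item", "  code", "-----"].each do |line|
  p [line, classify(line)]
end
```

Because rules are tested before list entries, a line of three or more hyphens is never mistaken for a "-" bullet.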
For debugging, we allow access to our line contents as text.
# File markup/simple_markup.rb, line 493
def content
  @lines.as_text
end
We take a string, split it into lines, work out the type of each line, and from there deduce groups of lines (for example all lines in a paragraph). We then invoke the output formatter using a Visitor to display the result
# File markup/simple_markup.rb, line 250
def convert(str, op, block_exceptions=nil)
  @lines = Lines.new(str.split(/\r?\n/).collect { |aLine|
                       Line.new(aLine) })
  return "" if @lines.empty?
  @lines.normalize
  @block_exceptions = block_exceptions
  assign_types_to_lines
  group = group_lines
  # call the output formatter to handle the result
  #      group.to_a.each {|i| p i}
  group.accept(@am, op)
end
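The first step of convert is an ordinary String#split, and its edge cases are worth seeing on their own. The sketch below uses a hypothetical `split_lines` helper that applies the same expression; it leaves out the Line and Lines wrappers, which are internal to the library.

```ruby
# Same split expression that convert uses; it accepts both Unix (\n)
# and Windows (\r\n) line endings. Hypothetical helper for illustration.
def split_lines(str)
  str.split(/\r?\n/)
end

p split_lines("one\r\ntwo\n\nthree\n")  # interior blank line is kept
p split_lines("")                        # no lines at all
```

String#split drops trailing empty fields, so trailing newlines do not produce blank lines, and an empty input yields no lines, which is presumably the case the early `return ""` guards against.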
For debugging, return the list of line types.
# File markup/simple_markup.rb, line 499
def get_line_types
  @lines.line_types
end
Return a block consisting of fragments which are paragraphs, list entries or verbatim text. We merge consecutive lines of the same type and level together. We are also slightly tricky with lists: the lines following a list introduction look like paragraph lines at the next level, and we remap them into list entries instead
# File markup/simple_markup.rb, line 464
def group_lines
  @lines.rewind

  inList = false
  wantedType = wantedLevel = nil

  block = LineCollection.new
  group = nil

  while line = @lines.next
    if line.level == wantedLevel and line.type == wantedType
      group.add_text(line.text)
    else
      group = block.fragment_for(line)
      block.add(group)
      if line.type == Line::LIST
        wantedType = Line::PARAGRAPH
      else
        wantedType = line.type
      end
      wantedLevel = line.type == Line::HEADING ? line.param : line.level
    end
  end

  block.normalize
  block
end
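The merging rule can be tried out in isolation with the sketch below. Everything in it is a stand-in: `TypedLine` and `group_typed_lines` are hypothetical names, fragments are plain hashes, merged text is joined with a single space (an assumption, since add_text is not shown here), and the heading special case and the final normalize step are left out.

```ruby
# Hypothetical stand-in for the library's stamped lines.
TypedLine = Struct.new(:type, :level, :text)

# Merge consecutive lines that share a type and level into one fragment.
# After a :LIST line, following :PARAGRAPH lines at the same level are
# folded into the list entry, mirroring the remapping described above.
def group_typed_lines(lines)
  groups = []
  wanted_type = wanted_level = nil
  lines.each do |line|
    if line.type == wanted_type && line.level == wanted_level
      groups.last[:text] << " " << line.text
    else
      groups << { type: line.type, level: line.level, text: line.text.dup }
      wanted_type  = line.type == :LIST ? :PARAGRAPH : line.type
      wanted_level = line.level
    end
  end
  groups
end

lines = [
  TypedLine.new(:PARAGRAPH, 0, "first"),
  TypedLine.new(:PARAGRAPH, 0, "second"),
  TypedLine.new(:LIST,      1, "item"),
  TypedLine.new(:PARAGRAPH, 1, "continued"),
  TypedLine.new(:VERBATIM,  0, "  code"),
]
group_typed_lines(lines).each { |g| p g }
```

Here the two paragraph lines collapse into one fragment, and the paragraph line that follows the list introduction becomes part of the list entry rather than a paragraph of its own.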
Handle labeled list entries. We have a special case to deal with: because the labels can be long, they force the remaining block of text over to the right:
| this is a long label that I wrote: | and here is the block of text with a silly margin |
So we allow the special case. If the label is followed by nothing, and if the following line is indented, then we take the indent of that line as the new margin
| this is a long label that I wrote: | here is a more reasonably indented block which will be attached to the label. |
# File markup/simple_markup.rb, line 411
def handled_labeled_list(line, level, margin, offset, prefix)
  prefix_length = prefix.length
  text = line.text
  flag = nil
  case prefix
  when /^\[/
    flag = ListBase::LABELED
    prefix = prefix[1, prefix.length-2]
  when /:$/
    flag = ListBase::NOTE
    prefix.chop!
  else raise "Invalid List Type: #{self.inspect}"
  end

  # body is on the next line

  if text.length <= offset
    original_line = line
    line = @lines.next
    return(false) unless line
    text = line.text

    for i in 0..margin
      if text[i] != SPACE
        @lines.unget
        return false
      end
    end
    i = margin
    i += 1 while text[i] == SPACE
    if i >= text.length
      @lines.unget
      return false
    else
      offset = i
      prefix_length = 0
      @lines.delete(original_line)
    end
  end

  line.stamp(Line::LIST, level+1, prefix, flag)
  text[margin, prefix_length] = " " * prefix_length
  assign_types_to_lines(offset, level + 1)
  return true
end
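Two small pieces of the method above are easy to check by hand: how the label is cleaned up, and how the new margin is found when the body starts on the following line. The helpers `split_label` and `body_offset` below are hypothetical and return plain values instead of stamping lines; they are a sketch of those two steps only.

```ruby
# Hypothetical helper mirroring the case statement above: returns the
# list flavour and the label text with its delimiters removed.
def split_label(prefix)
  case prefix
  when /^\[/ then [:LABELED, prefix[1, prefix.length - 2]]  # drop [ and ]
  when /:$/  then [:NOTE, prefix.chop]                      # drop one trailing colon
  else raise ArgumentError, "Invalid list type: #{prefix.inspect}"
  end
end

# Hypothetical helper: when the label stands alone on its line, the body's
# margin is the column of the first non-space character on the next line
# (nil if that line holds nothing but spaces).
def body_offset(next_line)
  next_line.index(/\S/)
end

p split_label("[cat]")
p split_label("cat::")
p body_offset("    here is a more reasonably indented block")
```

Only one of the two colons is removed from a note label, which is why the labels in the examples above still end in a single colon.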