Discussioni indice:Polo - Il milione, Pagani, Firenze 1827, I.djvu
Special test formatting
- Il milione contains a sequence of small numbered chapters (the first one is treated as chapter 0). Each whole chapter (text only, variants excluded) is wrapped in section tags whose name is the same as the chapter number (section begin="0", section begin="1"...).
- The variant list of each chapter is wrapped in section tags named v + chapter number (section begin="v0", section begin="v1"...).
- Every chapter is transcluded into its own ns0 page using two pages tags, both with the onlysection attribute (pages.... onlysection="0" + pages.... onlysection="v0").
- Variants are numbered; each variant is wrapped in a § template, which adds a named anchor whose name is parameter 1 of the template. The anchor name is the variant number, an underscore, then the book page number (the first, second and third variants of page 1 are 1_1, 2_1, 3_1).
- The text related to a variant is wrapped in a link pointing to the anchor.
See Pagina:Polo_-_Il_milione,_Pagani,_Firenze_1827,_I.djvu/221 and the related ns0 page Il milione (Pagani, 1827)/Il Milione di Marco Polo, Testo della Crusca/Introduzione for a working example.
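The naming convention listed above can be sketched in a few lines of Python. This is only an illustration: the helper names (section_names, variant_anchor, pages_tags) are hypothetical and belong to no Wikisource tool, and using the same from/to range for both pages tags is an assumption.

```python
INDEX = "Polo - Il milione, Pagani, Firenze 1827, I.djvu"


def section_names(chapter):
    """Return (text section, variants section) for a chapter number."""
    return str(chapter), "v" + str(chapter)


def variant_anchor(variant_number, book_page):
    """Anchor name: variant number, underscore, book page (e.g. 2_1)."""
    return f"{variant_number}_{book_page}"


def pages_tags(chapter, first, last, index=INDEX):
    """Build the two pages tags placed on the ns0 page of one chapter.

    Assumption: the chapter text and its variants span the same djvu pages.
    """
    return [
        f'<pages index="{index}" from={first} to={last} onlysection="{name}" />'
        for name in section_names(chapter)
    ]


# Chapter 1 runs over djvu pages 221-222.
for tag in pages_tags(1, 221, 222):
    print(tag)

# Anchors of the first three variants on book page 1.
print([variant_anchor(n, 1) for n in (1, 2, 3)])
```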
@Valp This convention is currently being discussed at Wikisource:Bar#Il_milione:_OK_alla_formattazione-test.3F.
- @Alex brollo Thanks, Alex. I understand everything except this: “section tags, whose male is the same of the chapter number”. Does “male” mean that the first section of a page should have the same number as the page (p. 5 => first section begin="5")? In fact page 2 (djvu/222) has begin="1"; pages 2, 3 & 4 have begin="2"; page 5 has begin="3". So which is it?
- The section name (not male.... :-( ) doesn't mirror the page number but the chapter number. So chapter 1 begins on book page 1 (djvu page 221) and ends on book page 2 (djvu page 222); both pieces are wrapped in <section begin="1" />... <section end="1" />. This allows the simple code <pages index="Polo - Il milione, Pagani, Firenze 1827, I.djvu" from=221 to=222 onlysection="1" />; in fact you could also write <pages index="Polo - Il milione, Pagani, Firenze 1827, I.djvu" from=1 to=300 onlysection="1" /> and get the same result, since onlysection means "transclude only the text wrapped in sections named "1"; the page interval hardly matters". I use onlysection in special cases, with excellent results. --Alex brollo (disc.) 13:28, 19 gen 2017 (CET)
- PS: did you guess the purpose of the following "memoRegex" section on this page? :-) --Alex brollo (disc.) 13:29, 19 gen 2017 (CET)
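To make the onlysection behaviour described above concrete, here is a rough Python simulation. It is not how the pages tag is actually implemented; the function only_section and the page contents are invented for illustration. It only shows why a narrow and a wide page interval give the same result when a section name is fixed.

```python
import re


def only_section(pages, name):
    """Collect, in page order, the text wrapped in sections called `name`."""
    marker = re.escape(name)
    pattern = re.compile(
        r'<section begin="%s" */>(.*?)<section end="%s" */>' % (marker, marker),
        re.DOTALL,
    )
    pieces = []
    for number in sorted(pages):
        pieces.extend(pattern.findall(pages[number]))
    return "".join(pieces)


# Invented page contents; the keys are djvu page numbers.
pages = {
    221: '<section begin="0" />Proem. <section end="0" />'
         '<section begin="1" />Chapter one starts <section end="1" />',
    222: '<section begin="1" />and ends here.<section end="1" />'
         '<section begin="2" />Chapter two.<section end="2" />',
    223: '<section begin="2" />More of chapter two.<section end="2" />',
}

narrow = only_section({n: pages[n] for n in (221, 222)}, "1")
wide = only_section(pages, "1")
print(narrow)
print(narrow == wide)
```

The wide interval costs nothing in this toy version because pages without a section named "1" simply contribute no text.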
- @Alex brollo As for the sections, it is clear now: same number as the chapter (and I strongly approve of numbering the Avant-Propos as 0, which the famous Paris ms. 1116 erroneously numbers as 1!). As for the RegEx etc., see below.
- @Alex brollo Now you can check whether I understood correctly: I have finished djvu/225-232 and created pp. 5, 6 & 7. When you have time, do tell me what is wrong or insufficient (except the Italian spelling, of course...).
Completed
@Alex brollo Wednesday 25: I completed up to I.djvu/243 and up to p. ..della_Crusca/25. You may check the whole lot, especially ..della_Crusca/9, where there is a strange duplication of the variants. Bye. -Valp (disc.) 16:53, 25 gen 2017 (CET)
Saturday 28 Jan.: I.djvu/353..355
memoRegex
{"^\\d+ .+\\n":["Eliminazione riga header che inizia con numero(regex)","","g"], "Dig.+ by .+gle":["Eliminazione Digitized by Google (regex)","","g"], "cbe":["cbe -> che","che","g"], "\\ c\\ ":["c isolato per e"," e ","g"], "qn":["inversione n in u","qu","g"], "([^aeiouAEIOU])’ +":["Normalizzazione spazi dopo apostrofo (regex)","$1’","g"], "(\\w)[ ]([;,:\\.?!])":["Normalizzazione spazi attorno alla punteggiatura (regex)","$1$2","g"], "\\n:":["due punti a inizio riga (regex)",":","g"], "1’":["scanno comune per l'","l’","g"], "\\ cosi\\ ":["scanno comune per così"," così ","g"], "’1":["scanno comune per 'l","’l","g"], "é":["","è","g"], "\\(\\ ":["","(","g"], "\\ \\)":["",")","g"], "\\t":["(regex)"," ","g"], "\\ \\ ":[""," ","g"]}
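For readers without the gadget, the list above can be read as "pattern": ["description", "replacement", "flags"]. The Python sketch below applies four of those rules to a sample string. It is not the memoRegex gadget itself: treating every key as a regular expression, applying the rules in listed order, and mapping the "g" flag to "replace all" are assumptions, and apply_rules is a hypothetical helper. The replacement syntax is converted from the $1 form to Python's \g<1> form.

```python
import json
import re

# Four rules copied from the list above (descriptions left untouched).
RULES_JSON = r'''
{"cbe": ["cbe -> che", "che", "g"],
 "qn": ["inversione n in u", "qu", "g"],
 "(\\w)[ ]([;,:\\.?!])": ["Normalizzazione spazi attorno alla punteggiatura (regex)", "$1$2", "g"],
 "\\ \\ ": ["", " ", "g"]}
'''


def apply_rules(text, rules):
    """Apply each rule in order; every key is assumed to be a regex."""
    for pattern, (_description, replacement, flags) in rules.items():
        # Turn $1, $2... into Python group references \g<1>, \g<2>...
        py_replacement = re.sub(r"\$(\d)", r"\\g<\1>", replacement)
        count = 0 if "g" in flags else 1  # assumption: no "g" = first match only
        text = re.sub(pattern, py_replacement, text, count=count)
    return text


rules = json.loads(RULES_JSON)
sample = "Il libro cbe qnesto  uomo scrisse , e disse ."
print(apply_rules(sample, rules))
```

Replacements containing literal backslashes would need extra escaping in this sketch; none of the rules above has one.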
mM comments
- @Alex brollo This seems wonderful, but I do not know how to use it. (My app reads the selected text of the page being edited and, depending on which item I click in my menu, copies a processed version of it to the clipboard, so I paste it directly over the selection with Ctrl+V.) -Valp (disc.) 01:33, 19 gen 2017 (CET)
- Your procedure is very interesting. I too sometimes edit text with an external text processor (notepad++), but only for "massive edits" (I download the whole text layer of the book, edit it with global regex substitutions, and finally upload the edited text into nsPage with a bot). Can you give me some more details about your application?
- About memoRegex: it is a trick to save, and run again, any substitution made with a Find & Replace tool. If you would like to try it, you need to activate the "Strumenti per la rilettura" gadget and the "memoRegex" gadget. But if you feel comfortable with your editing method, don't waste your time! Alex brollo (disc.) 15:35, 19 gen 2017 (CET)
- @Alex brollo My method is an exe created with Visual Basic (I never got into C and JS code, which I find too abstract). First I use a "ShellWindows" object, from which I grab the Wikisource page being edited in Internet Explorer (some JS tricks could probably do the same in Chrome; I would be happy to learn them). Then I have two wonderful lines:
- Set htRg = IE.Document.Selection.CreateRange
- If TypeName(htRg) = "IHTMLTxtRange" Then WhatIsSelected = htRg.Text
- so I can process WhatIsSelected as needed (VB being easier and more powerful than regular expressions) and pass the result to the clipboard; Ctrl+V then directly replaces WhatIsSelected in the page being edited.
- The app shows a tiny 3x5 cm menu which, after two days of development, includes 18 functions.
- But I still need you to tell me your conventions, for instance:
- dell' [aeiouAEIOU] or dell'[aeiouAEIOU] or dell’ i.e. chr(146)?
- “a” or « a » or ‘a’ or ?? (which quotation marks?)
- So, thank you and best wishes. --Valp (disc.) 18:38, 19 gen 2017 (CET)
- There are some Italian rules and some Wikisource conventions, and there are tools that make them simpler to apply. Wikisource uses the typographic apostrophe ’ (chr(146)) in text, and there are tools to convert the typewriter ' into ’ (taking care to avoid converting wiki markup, template and link content, and other cases: not a simple issue). A space may or may not be required after an apostrophe; a rough rule is: consonant + apostrophe + vowel = no space, e.g. "dell'uomo, un'anima"; vowel + apostrophe + consonant = a space, e.g. "un po' di pane".
- About quotation marks, it's simpler: they mirror the source text; there's no "Wikisource style". Alex brollo (disc.) 21:41, 20 gen 2017 (CET)
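The rough apostrophe rule above can be tried out with a short Python sketch. It is deliberately naive and is not one of the existing Wikisource tools: normalize_apostrophes is a hypothetical name, the vowel list is an assumption, and the function makes no attempt to skip wiki markup, templates or links, which a real tool must do.

```python
import re

# Assumed vowel set (plain and accented Italian vowels).
VOWELS = "aeiouAEIOUàèéìòùÀÈÉÌÒÙ"
# Any letter that is not in VOWELS, i.e. a consonant.
CONSONANT = rf"[^\W\d_{VOWELS}]"


def normalize_apostrophes(text):
    """Apply the rough rule to plain text (no wiki markup handling)."""
    # Typewriter apostrophe -> typographic apostrophe.
    text = text.replace("'", "’")
    # consonant + apostrophe + vowel: drop any space after the apostrophe.
    text = re.sub(rf"({CONSONANT})’ +(?=[{VOWELS}])", r"\1’", text)
    # vowel + apostrophe + consonant: make sure a space follows.
    text = re.sub(rf"([{VOWELS}])’(?={CONSONANT})", r"\1’ ", text)
    return text


print(normalize_apostrophes("dell' uomo, un'anima"))
print(normalize_apostrophes("un po'di pane"))
```

Being only a rough rule, it will mishandle some legitimate spellings, so its output would still need proofreading against the page image.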