8.1.12.1.1. cltk.prosody.lat package¶
8.1.12.1.1.1. Submodules¶
8.1.12.1.1.2. cltk.prosody.lat.clausulae_analysis module¶
Return dictionary of clausulae found in the prosody of Latin prose.
The clausulae analysis function returns a dictionary in which the key is the type of clausula and the value is the number of times it occurs in the text. The list of clausulae used in the method is derived from the 2019 Journal of Roman Studies paper "Auceps syllabarum: A Digital Analysis of Latin Prose Rhythm". The clausulae are mutually exclusive, so no rhythm will be counted in multiple categories.
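The counting logic described above can be sketched as follows. This is a simplified standalone version, not the CLTK implementation; the two-pattern list and the treatment of 'x' as an anceps (matches-anything) final syllable are illustrative assumptions:

```python
def count_clausulae(scansions, patterns):
    """Count clausula rhythms at line ends.

    patterns: an ordered list of (name, rhythm) pairs, where 'x' in a
    rhythm is anceps and matches any final syllable. Patterns are tried
    in order and each line is counted at most once, which keeps the
    categories mutually exclusive, as the paper's method requires.
    """
    counts = {name: 0 for name, _ in patterns}
    for line in scansions:
        for name, rhythm in patterns:
            tail = line[-len(rhythm):]
            if len(tail) == len(rhythm) and all(
                p == 'x' or p == s for p, s in zip(rhythm, tail)
            ):
                counts[name] += 1
                break  # mutually exclusive: stop at the first match
    return counts

# Two illustrative patterns; the real list has 22 entries.
patterns = [('cretic_trochee', '-u--x'), ('spondaic', '---x')]
print(count_clausulae(['-uuu-uuu-u--x', 'uu-uu-uu----x'], patterns))
# {'cretic_trochee': 1, 'spondaic': 1}
```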
-
class cltk.prosody.lat.clausulae_analysis.Clausula(rhythm_name, rhythm)¶
Bases: tuple
-
property rhythm¶
Alias for field number 1
-
property rhythm_name¶
Alias for field number 0
-
class cltk.prosody.lat.clausulae_analysis.Clausulae(rhythms=[Clausula(rhythm_name='cretic_trochee', rhythm='-u--x'), Clausula(rhythm_name='cretic_trochee_resolved_a', rhythm='uuu--x'), Clausula(rhythm_name='cretic_trochee_resolved_b', rhythm='-uuu-x'), Clausula(rhythm_name='cretic_trochee_resolved_c', rhythm='-u-uux'), Clausula(rhythm_name='double_cretic', rhythm='-u--ux'), Clausula(rhythm_name='molossus_cretic', rhythm='----ux'), Clausula(rhythm_name='double_molossus_cretic_resolved_a', rhythm='uuu--ux'), Clausula(rhythm_name='double_molossus_cretic_resolved_b', rhythm='-uuu-ux'), Clausula(rhythm_name='double_molossus_cretic_resolved_c', rhythm='-u-uux'), Clausula(rhythm_name='double_molossus_cretic_resolved_d', rhythm='uu---ux'), Clausula(rhythm_name='double_molossus_cretic_resolved_e', rhythm='-uu--ux'), Clausula(rhythm_name='double_molossus_cretic_resolved_f', rhythm='--uu-ux'), Clausula(rhythm_name='double_molossus_cretic_resolved_g', rhythm='---uuux'), Clausula(rhythm_name='double_molossus_cretic_resolved_h', rhythm='-u---ux'), Clausula(rhythm_name='double_trochee', rhythm='-u-x'), Clausula(rhythm_name='double_trochee_resolved_a', rhythm='uuu-x'), Clausula(rhythm_name='double_trochee_resolved_b', rhythm='-uuux'), Clausula(rhythm_name='hypodochmiac', rhythm='-u-ux'), Clausula(rhythm_name='hypodochmiac_resolved_a', rhythm='uuu-ux'), Clausula(rhythm_name='hypodochmiac_resolved_b', rhythm='-uuuux'), Clausula(rhythm_name='spondaic', rhythm='---x'), Clausula(rhythm_name='heroic', rhythm='-uu-x')])[source]¶
Bases: object
-
clausulae_analysis(prosody)[source]¶
Return a dictionary in which the key is a type of clausula and the value is its frequency.
- Parameters
prosody (List) – the prosody of a prose text (must be in the format of the scansion produced by the scanner classes)
- Return type
List[Dict[str, int]]
- Returns
dictionary of prosody

>>> Clausulae().clausulae_analysis(['-uuu-uuu-u--x', 'uu-uu-uu----x'])
[{'cretic_trochee': 1}, {'cretic_trochee_resolved_a': 0}, {'cretic_trochee_resolved_b': 0}, {'cretic_trochee_resolved_c': 0}, {'double_cretic': 0}, {'molossus_cretic': 0}, {'double_molossus_cretic_resolved_a': 0}, {'double_molossus_cretic_resolved_b': 0}, {'double_molossus_cretic_resolved_c': 0}, {'double_molossus_cretic_resolved_d': 0}, {'double_molossus_cretic_resolved_e': 0}, {'double_molossus_cretic_resolved_f': 0}, {'double_molossus_cretic_resolved_g': 0}, {'double_molossus_cretic_resolved_h': 0}, {'double_trochee': 0}, {'double_trochee_resolved_a': 0}, {'double_trochee_resolved_b': 0}, {'hypodochmiac': 0}, {'hypodochmiac_resolved_a': 0}, {'hypodochmiac_resolved_b': 0}, {'spondaic': 1}, {'heroic': 0}]
-
8.1.12.1.1.3. cltk.prosody.lat.hendecasyllable_scanner module¶
Utility class for producing a scansion pattern for Latin hendecasyllables.
Given a line of hendecasyllables, the scan method performs a series of transformations and checks; for each one performed successfully, a note is added to the scansion_notes list so that end users may view the provenance of a scansion.
-
class cltk.prosody.lat.hendecasyllable_scanner.HendecasyllableScanner(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>, syllabifier=<cltk.prosody.lat.syllabifier.Syllabifier object>, optional_transform=False, *args, **kwargs)[source]¶
Bases: cltk.prosody.lat.verse_scanner.VerseScanner
The scansion symbols used can be configured by passing a suitable constants class to the constructor.
-
scan(original_line, optional_transform=False)[source]¶
Scan a line of Latin hendecasyllables and produce a scansion pattern, and other data.
- Parameters
original_line (str) – the original line of Latin verse
optional_transform (bool) – whether or not to perform the i to j transform for syllabification
- Return type
- Returns
a Verse object
>>> scanner = HendecasyllableScanner()
>>> print(scanner.scan("Cui dono lepidum novum libellum"))
Verse(original='Cui dono lepidum novum libellum', scansion=' - U - U U - U - U - U ', meter='hendecasyllable', valid=True, syllable_count=11, accented='Cui donō lepidūm novūm libēllum', scansion_notes=['Corrected invalid start.'], syllables = ['Cui', 'do', 'no', 'le', 'pi', 'dūm', 'no', 'vūm', 'li', 'bēl', 'lum'])
>>> print(scanner.scan(
...     "ārida modo pumice expolitum?").scansion)
- U U U - U - U - U
-
correct_invalid_start(scansion)[source]¶
The third syllable of a hendecasyllabic line is long, so we will convert it.
- Parameters
scansion (str) – scansion string
- Return type
str
- Returns
scansion string with corrected start
>>> print(HendecasyllableScanner().correct_invalid_start(
...     "- U U U U - U - U - U").strip())
- U - U U - U - U - U
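The correction itself is simple; a minimal standalone sketch of the idea, assuming the space-separated scansion symbols shown in the doctest (not the CLTK implementation, which also preserves the original spacing):

```python
def force_long_third_syllable(scansion: str) -> str:
    """Mark the third syllable of a hendecasyllabic scansion long.

    Operates on space-separated stress symbols; leading/trailing
    whitespace is not preserved in this sketch.
    """
    symbols = scansion.split()
    if len(symbols) >= 3:
        symbols[2] = '-'  # the third syllable of a hendecasyllable is long
    return ' '.join(symbols)

print(force_long_third_syllable("- U U U U - U - U - U"))
# - U - U U - U - U - U
```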
-
correct_antepenult_chain(scansion)[source]¶
For hendecasyllables the last three feet of the verse are predictable and do not regularly allow substitutions.
- Parameters
scansion (str) – scansion line thus far
- Return type
str
- Returns
corrected line of scansion
>>> print(HendecasyllableScanner().correct_antepenult_chain(
...     "-U -UU UU UU UX").strip())
-U -UU -U -U -X
-
8.1.12.1.1.4. cltk.prosody.lat.hexameter_scanner module¶
Utility class for producing a scansion pattern for a Latin hexameter.
Given a line of hexameter, the scan method performs a series of transformations and checks; for each one performed successfully, a note is added to the scansion_notes list so that end users may view the provenance of a scansion.
Because hexameters have strict rules on the position and quantity of stressed and unstressed syllables, we can often infer the stress qualities of many syllables, given a valid hexameter. If the Latin hexameter provided is not accented with macrons, then a best guess is made. For the scansion produced, the stress of a diphthong is indicated in the second of the two vowel positions; for the accented line produced, the diphthong stress is not indicated with any macronized vowels.
-
class cltk.prosody.lat.hexameter_scanner.HexameterScanner(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>, syllabifier=<cltk.prosody.lat.syllabifier.Syllabifier object>, optional_transform=False, *args, **kwargs)[source]¶
Bases: cltk.prosody.lat.verse_scanner.VerseScanner
The scansion symbols used can be configured by passing a suitable constants class to the constructor.
-
scan(original_line, optional_transform=False, dactyl_smoothing=False)[source]¶
Scan a line of Latin hexameter and produce a scansion pattern, and other data.
- Parameters
original_line (str) – the original line of Latin verse
optional_transform (bool) – whether or not to perform the i to j transform for syllabification
dactyl_smoothing (bool) – whether or not to perform dactyl smoothing
- Return type
- Returns
a Verse object
>>> scanner = HexameterScanner()
>>> print(HexameterScanner().scan(
...     "ēxiguām sedēm pariturae tērra negavit").scansion)
- - - - - U U - - - U U - U
>>> print(scanner.scan("impulerit. Tantaene animis caelestibus irae?"))
Verse(original='impulerit. Tantaene animis caelestibus irae?', scansion='- U U - - - U U - - - U U - - ', meter='hexameter', valid=True, syllable_count=15, accented='īmpulerīt. Tāntaene animīs caelēstibus īrae?', scansion_notes=['Valid by positional stresses.'], syllables = ['īm', 'pu', 'le', 'rīt', 'Tān', 'taen', 'a', 'ni', 'mīs', 'cae', 'lēs', 'ti', 'bus', 'i', 'rae'])
>>> print(scanner.scan(
...     "Arma virumque cano, Troiae qui prīmus ab ōrīs").scansion)
- U U - U U - - - - - U U - -
>>> # some hexameters need the optional transformations:
>>> optional_transform_scanner = HexameterScanner(optional_transform=True)
>>> print(optional_transform_scanner.scan(
...     "Ītaliam, fāto profugus, Lāvīniaque vēnit").scansion)
- - - - - U U - - - U U - U
>>> print(HexameterScanner().scan(
...     "lītora, multum ille et terrīs iactātus et alto").scansion)
- U U - - - - - - - U U - U
>>> print(HexameterScanner().scan(
...     "vī superum saevae memorem Iūnōnis ob īram;").scansion)
- U U - - - U U - - - U U - U
>>> # handle multiple elisions
>>> print(scanner.scan("monstrum horrendum, informe, ingens, cui lumen ademptum").scansion)
- - - - - - - - - U U - U
>>> # if we have 17 syllables, create a chain of all dactyls
>>> print(scanner.scan("quadrupedante putrem sonitu quatit ungula campum"
...     ).scansion)
- U U - U U - U U - U U - U U - U
>>> # if we have 13 syllables exactly, we'll create a spondaic hexameter
>>> print(HexameterScanner().scan(
...     "illi inter sese multa vi bracchia tollunt").scansion)
- - - - - - - - - UU - -
>>> print(HexameterScanner().scan(
...     "dat latus; insequitur cumulo praeruptus aquae mons").scansion)
- U U - U U - U U - - - U U - -
>>> print(optional_transform_scanner.scan(
...     "Non quivis videt inmodulata poëmata iudex").scansion)
- - - U U - U U - U U- U U - -
>>> print(HexameterScanner().scan(
...     "certabant urbem Romam Remoramne vocarent").scansion)
- - - - - - - U U - U U - -
>>> # advanced smoothing is available via keyword flags: dactyl_smoothing
>>> # print(HexameterScanner().scan(
... #     "his verbis: 'o gnata, tibi sunt ante ferendae",
... #     dactyl_smoothing=True).scansion)
... # - - - - - U U - - - U U - -
-
correct_invalid_fifth_foot(scansion)[source]¶
The 'inverted amphibrach' (a stressed_unstressed_stressed syllable pattern) is invalid in hexameters, so here we coerce it to stressed when it occurs at the end of a line.
- Parameters
scansion (str) – the scansion pattern
- Return type
str
- Returns
the corrected scansion pattern

>>> print(HexameterScanner().correct_invalid_fifth_foot(
...     " - - - U U - U U U - - U U U - x"))
 - - - U U - U U U - - - U U - x
-
invalid_foot_to_spondee(feet, foot, idx)[source]¶
In hexameters, a single foot that is an unstressed_stressed syllable pattern is often just a double spondee, so here we coerce it to stressed.
- Parameters
feet (list) – list of string representations of metrical feet
foot (str) – the bad foot to correct
idx (int) – the index of the foot to correct
- Return type
str
- Returns
corrected scansion
>>> print(HexameterScanner().invalid_foot_to_spondee(
...     ['-UU', '--', '-U', 'U-', '--', '-UU'], '-U', 2))
-UU----U----UU
-
correct_dactyl_chain(scansion)[source]¶
Three or more unstressed accents in a row is a broken dactyl chain, best detected and processed backwards.
Since this method takes a Procrustean approach to modifying the scansion pattern, it is not used by default in the scan method; however, it is available as an optional keyword parameter, and users looking to further automate the generation of scansion candidates should consider using this as a fall back.
- Parameters
scansion (str) – scansion with broken dactyl chain; inverted amphibrachs not allowed
- Return type
str
- Returns
corrected line of scansion
>>> print(HexameterScanner().correct_dactyl_chain(
...     "- U U - - U U - - - U U - x"))
- - - - - U U - - - U U - x
>>> print(HexameterScanner().correct_dactyl_chain(
...     "- U U U U - - - - - U U - U"))
- - - U U - - - - - U U - U
-
correct_inverted_amphibrachs(scansion)[source]¶
The 'inverted amphibrach' (a stressed_unstressed_stressed syllable pattern) is invalid in hexameters, so here we coerce it to stressed: - U - -> - - -
- Parameters
scansion (str) – the scansion stress pattern
- Return type
str
- Returns
a string with the corrected scansion pattern
>>> print(HexameterScanner().correct_inverted_amphibrachs(
...     " - U - - U - U U U U - U - x"))
 - - - - - - U U U U - - - x
>>> print(HexameterScanner().correct_inverted_amphibrachs(
...     " - - - U - - U U U U U- - U - x"))
 - - - - - - U U U U U- - - - x
>>> print(HexameterScanner().correct_inverted_amphibrachs(
...     "- - - - - U - U U - U U - -"))
- - - - - - - U U - U U - -
>>> print(HexameterScanner().correct_inverted_amphibrachs(
...     "- UU- U - U - - U U U U- U"))
- UU- - - - - - U U U U- U
-
8.1.12.1.1.5. cltk.prosody.lat.macronizer module¶
Delineate the length of Latin vowels.
The Macronizer class places a macron over naturally long Latin vowels. To discern whether a vowel is long, a word is first matched with its Morpheus entry by way of its POS tag. The Morpheus entry includes the macronized form of the matched word.
Since the accuracy of the macronizer largely derives from the accuracy of the POS tagger used to match words to their Morpheus entries, the Macronizer class allows multiple POS taggers to be used.
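The lookup strategy described above (word plus POS tag keys into a macronized form) can be sketched as follows. The two-entry "Morpheus" table and the nominative tag are invented for illustration; the real database is far larger:

```python
# A minimal sketch of tag-based macronization, not the CLTK code.
# The tiny table below is hypothetical sample data: the same surface
# form 'gallia' macronizes differently depending on its tag.
MORPHEUS = {
    ('gallia', 'n-s---fb-'): 'galliā',  # ablative: long final a
    ('gallia', 'n-s---fn-'): 'gallia',  # nominative: short final a
}

def macronize_word(word: str, tag: str) -> str:
    """Return the macronized form for (word, tag); fall back to the
    word unchanged when no entry matches."""
    return MORPHEUS.get((word.lower(), tag), word)

print(macronize_word('Gallia', 'n-s---fb-'))  # galliā
```

Because the key includes the tag, a wrong POS tag silently selects the wrong (or no) macronized form, which is why the class docstring ties the macronizer's accuracy to the tagger's.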
Todo
Determine how to disambiguate tags (see logger)
-
class cltk.prosody.lat.macronizer.Macronizer(tagger)[source]¶
Bases: object
Macronize Latin words.
Macronize text by using the POS tag to find the macronized form within the Morpheus database.
-
_retrieve_tag(text)[source]¶
Tag text with chosen tagger and clean tags.
Tag format: [('word', 'tag')]
- Parameters
text – string
- Returns
list of tuples, with each tuple containing the word and its POS tag
- Return type
list
-
_retrieve_morpheus_entry(word)[source]¶
Return Morpheus entry for word.
Entry format: [(head word, tag, macronized form)]
- Parameters
word – unmacronized, lowercased word
- Param type
string
- Returns
Morpheus entry in tuples
- Return type
list
-
_macronize_word(word)[source]¶
Return macronized word.
- Parameters
word – (word, tag)
- Param type
tuple
- Returns
(word, tag, macronized_form)
- Return type
tuple
Return macronized form along with POS tags.
E.g. "Gallia est omnis divisa in partes tres," -> [('gallia', 'n-s---fb-', 'galliā'), ('est', 'v3spia---', 'est'), ('omnis', 'a-s---mn-', 'omnis'), ('divisa', 't-prppnn-', 'dīvīsa'), ('in', 'r--------', 'in'), ('partes', 'n-p---fa-', 'partēs'), ('tres', 'm--------', 'trēs')]
- Parameters
text – raw text
- Returns
tuples of head word, tag, macronized form
- Return type
list
-
8.1.12.1.1.6. cltk.prosody.lat.metrical_validator module¶
Utility class for validating scansion patterns: hexameter, hendecasyllables, pentameter. Allows users to configure the scansion symbols internally via a constructor argument; a suitable default is provided.
-
class cltk.prosody.lat.metrical_validator.MetricalValidator(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>)[source]¶
Bases: object
Currently supports validation for: hexameter, hendecasyllables, pentameter.
-
is_valid_hexameter(scanned_line)[source]¶
Determine if a scansion pattern is one of the valid hexameter metrical patterns.
- Parameters
scanned_line (str) – a line containing a sequence of stressed and unstressed syllables
- Return type
bool

>>> print(MetricalValidator().is_valid_hexameter("-UU---UU---UU-U"))
True
-
is_valid_hendecasyllables(scanned_line)[source]¶
Determine if a scansion pattern is one of the valid hendecasyllable metrical patterns.
- Parameters
scanned_line (str) – a line containing a sequence of stressed and unstressed syllables
- Return type
bool

>>> print(MetricalValidator().is_valid_hendecasyllables("-U-UU-U-U-U"))
True
-
is_valid_pentameter(scanned_line)[source]¶
Determine if a scansion pattern is one of the valid pentameter metrical patterns.
- Parameters
scanned_line (str) – a line containing a sequence of stressed and unstressed syllables
- Return type
bool
- Returns
whether or not the scansion is a valid pentameter

>>> print(MetricalValidator().is_valid_pentameter('-UU-UU--UU-UUX'))
True
-
hexameter_feet(scansion)[source]¶
Produce a list of hexameter feet, stressed and unstressed syllables with spaces intact. If the scansion line is not entirely correct, it will attempt to corral one or more improper patterns into one or more feet.
- Parameters
scansion – the scanned line
- Return type
List[str]
- Returns
list of strings representing the feet of the hexameter; if the scansion is wildly incorrect, an empty list

>>> print("|".join(MetricalValidator().hexameter_feet(
...     "- U U - - - - - - - U U - U")).strip())
- U U |- - |- - |- - |- U U |- U
>>> print("|".join(MetricalValidator().hexameter_feet(
...     "- U U - - U - - - - U U - U")).strip())
- U U |- - |U - |- - |- U U |- U
-
static hexameter_known_stresses()[source]¶
Provide a list of known stress positions for a hexameter.
- Return type
List[int]
- Returns
a zero-based list enumerating which syllables are known to be stressed.
-
static hexameter_possible_unstresses()[source]¶
Provide a list of possible positions which may be unstressed syllables in a hexameter.
- Return type
List[int]
- Returns
a zero-based list enumerating which syllables may be unstressed.
-
closest_hexameter_patterns(scansion)[source]¶
Find the closest group of matching valid hexameter patterns.
- Return type
List[str]
- Returns
list of the closest valid hexameter patterns; only candidates with a matching length/number of syllables are considered.

>>> print(MetricalValidator().closest_hexameter_patterns('-UUUUU-----UU--'))
['-UU-UU-----UU--']
-
static pentameter_possible_stresses()[source]¶
Provide a list of possible stress positions for a pentameter.
- Return type
List[int]
- Returns
a zero-based list enumerating which syllables are known to be stressed.
-
closest_pentameter_patterns(scansion)[source]¶
Find the closest group of matching valid pentameter patterns.
- Return type
List[str]
- Returns
list of the closest valid pentameter patterns; only candidates with a matching length/number of syllables are considered.

>>> print(MetricalValidator().closest_pentameter_patterns('--UUU--UU-UUX'))
['---UU--UU-UUX']
-
closest_hendecasyllable_patterns(scansion)[source]¶
Find the closest group of matching valid hendecasyllable patterns.
- Return type
List[str]
- Returns
list of the closest valid hendecasyllable patterns; only candidates with a matching length/number of syllables are considered.

>>> print(MetricalValidator().closest_hendecasyllable_patterns('UU-UU-U-U-X'))
['-U-UU-U-U-X', 'U--UU-U-U-X']
-
_closest_patterns(patterns, scansion)[source]¶
Find the closest group of matching valid patterns.
- Parameters
patterns – a list of patterns
scansion – the scansion pattern thus far
- Return type
List[str]
- Returns
list of the closest valid patterns; only candidates with a matching length/number of syllables are considered.
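The matching described here (same-length candidates only, ranked by similarity) can be sketched as a Hamming-distance search. This is a simplified stand-in for the real matcher, which the public methods above seed with their respective valid-pattern lists:

```python
def closest_patterns(valid_patterns, scansion):
    """Return the valid pattern(s) with the same number of syllables as
    `scansion` that differ from it in the fewest positions.

    A sketch: only same-length candidates are compared, matching the
    documented behavior; distance is a plain position-wise mismatch count.
    """
    target = scansion.replace(' ', '')
    candidates = [p for p in valid_patterns if len(p) == len(target)]
    if not candidates:
        return []

    def distance(pattern):
        return sum(a != b for a, b in zip(pattern, target))

    best = min(distance(p) for p in candidates)
    return [p for p in candidates if distance(p) == best]
```

Returning all patterns at the minimum distance (rather than a single winner) mirrors the plural "patterns" in the method names: an ambiguous scansion can be equally close to several valid lines.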
-
_build_hexameter_template(stress_positions)[source]¶
Build a hexameter scansion template from a string of 5 binary numbers. NOTE: traditionally the fifth foot is a dactyl and spondee substitution is rare, but since it is a possible combination, we include it here.
- Parameters
stress_positions (str) – 5 binary integers, indicating whether each foot is a dactyl or a spondee
- Return type
str
- Returns
a valid hexameter scansion template: a string representing stressed and unstressed syllables, with the optional terminal ending.

>>> print(MetricalValidator()._build_hexameter_template("01010"))
-UU---UU---UU-X
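Judging from the doctest, each of the 5 digits selects a dactyl ('0' → -UU) or spondee ('1' → --) for that foot, with a fixed anceps close (-X); that mapping is an inference from the example, not taken from the source. A standalone sketch, which also enumerates all 32 hexameter skeletons:

```python
from itertools import product

def build_hexameter_template(stress_positions: str) -> str:
    """Build a hexameter template from 5 binary digits.

    '0' selects a dactyl (-UU), '1' a spondee (--); the line closes
    with the anceps ending '-X'. Mapping inferred from the doctest.
    """
    feet = {'0': '-UU', '1': '--'}
    return ''.join(feet[d] for d in stress_positions) + '-X'

# Every combination of dactyl/spondee in the five variable feet
# yields one of the 32 valid hexameter skeletons.
templates = {build_hexameter_template(''.join(bits))
             for bits in product('01', repeat=5)}
print(build_hexameter_template('01010'))  # -UU---UU---UU-X
```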
-
8.1.12.1.1.7. cltk.prosody.lat.pentameter_scanner module¶
Utility class for producing a scansion pattern for a Latin pentameter.
Given a line of pentameter, the scan method performs a series of transformations and checks; for each one performed successfully, a note is added to the scansion_notes list so that end users may view the provenance of a scansion.
-
class cltk.prosody.lat.pentameter_scanner.PentameterScanner(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>, syllabifier=<cltk.prosody.lat.syllabifier.Syllabifier object>, optional_transform=False, *args, **kwargs)[source]¶
Bases: cltk.prosody.lat.verse_scanner.VerseScanner
The scansion symbols used can be configured by passing a suitable constants class to the constructor.
-
scan(original_line, optional_transform=False)[source]¶
Scan a line of Latin pentameter and produce a scansion pattern, and other data.
- Parameters
original_line (str) – the original line of Latin verse
optional_transform (bool) – whether or not to perform the i to j transform for syllabification
- Return type
- Returns
a Verse object
>>> scanner = PentameterScanner()
>>> print(scanner.scan('ex hoc ingrato gaudia amore tibi.'))
Verse(original='ex hoc ingrato gaudia amore tibi.', scansion='- - - - - - U U - U U U ', meter='pentameter', valid=True, syllable_count=12, accented='ēx hōc īngrātō gaudia amōre tibi.', scansion_notes=['Spondaic pentameter'], syllables = ['ēx', 'hoc', 'īn', 'gra', 'to', 'gau', 'di', 'a', 'mo', 're', 'ti', 'bi'])
>>> print(scanner.scan(
...     "in vento et rapida scribere oportet aqua.").scansion)
- - - U U - - U U - U U U
-
make_spondaic(scansion)[source]¶
If a pentameter line has 12 syllables, then it must start with double spondees.
- Parameters
scansion (str) – a string of scansion patterns
- Return type
str
- Returns
a scansion pattern string starting with two spondees
>>> print(PentameterScanner().make_spondaic("U U U U U U U U U U U U"))
- - - - - - U U - U U U
-
make_dactyls(scansion)[source]¶
If a pentameter line has 14 syllables, it starts and ends with double dactyls.
- Parameters
scansion (str) – a string of scansion patterns
- Return type
str
- Returns
a scansion pattern string starting and ending with double dactyls
>>> print(PentameterScanner().make_dactyls("U U U U U U U U U U U U U U"))
- U U - U U - - U U - U U U
-
correct_penultimate_dactyl_chain(scansion)[source]¶
For pentameter the last two feet of the verse are predictable dactyls, and do not regularly allow substitutions.
- Parameters
scansion (str) – scansion line thus far
- Return type
str
- Returns
corrected line of scansion
>>> print(PentameterScanner().correct_penultimate_dactyl_chain(
...     "U U U U U U U U U U U U U U"))
U U U U U U U - U U - U U U
-
8.1.12.1.1.8. cltk.prosody.lat.scanner module¶
Scansion module for scanning Latin prose rhythms.
-
class cltk.prosody.lat.scanner.Scansion(punctuation=None, clausula_length=13, elide=True)[source]¶
Bases: object
Preprocesses Latin text for prose rhythm analysis.
-
SHORT_VOWELS = ['a', 'e', 'i', 'o', 'u', 'y']¶
-
LONG_VOWELS = ['ā', 'ē', 'ī', 'ō', 'ū']¶
-
VOWELS = ['a', 'e', 'i', 'o', 'u', 'y', 'ā', 'ē', 'ī', 'ō', 'ū']¶
-
DIPHTHONGS = ['ae', 'au', 'ei', 'oe', 'ui']¶
-
SINGLE_CONSONANTS = ['b', 'c', 'd', 'g', 'k', 'l', 'm', 'n', 'p', 'q', 'r', 's', 't', 'v', 'f', 'j']¶
-
DOUBLE_CONSONANTS = ['x', 'z']¶
-
CONSONANTS = ['b', 'c', 'd', 'g', 'k', 'l', 'm', 'n', 'p', 'q', 'r', 's', 't', 'v', 'f', 'j', 'x', 'z']¶
-
DIGRAPHS = ['ch', 'ph', 'th', 'qu']¶
-
LIQUIDS = ['r', 'l']¶
-
MUTES = ['b', 'p', 'd', 't', 'c', 'g']¶
-
MUTE_LIQUID_EXCEPTIONS = ['gl', 'bl']¶
-
NASALS = ['m', 'n']¶
-
SESTS = ['sc', 'sm', 'sp', 'st', 'z']¶
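These constants feed the long-by-position test seen in the tokenizer output below. A simplified standalone sketch of the classic rule (a vowel followed by a double consonant or two consonants is long by position, except before a mute-plus-liquid cluster, where quantity stays free); the MUTE_LIQUID_EXCEPTIONS and SESTS special cases are omitted here for brevity:

```python
# Sketch of the long-by-position rule, using the constant lists above.
MUTES = ['b', 'p', 'd', 't', 'c', 'g']
LIQUIDS = ['r', 'l']
DOUBLE_CONSONANTS = ['x', 'z']
CONSONANTS = ['b', 'c', 'd', 'g', 'k', 'l', 'm', 'n', 'p', 'q',
              'r', 's', 't', 'v', 'f', 'j'] + DOUBLE_CONSONANTS

def long_by_position(following: str) -> bool:
    """Is a vowel long by position, given the letters that follow it?

    Long before a double consonant (x, z) or any two consonants,
    except a mute + liquid cluster, which leaves the quantity free.
    """
    if not following:
        return False
    first = following[0]
    if first in DOUBLE_CONSONANTS:
        return True
    if len(following) >= 2 and first in CONSONANTS and following[1] in CONSONANTS:
        if first in MUTES and following[1] in LIQUIDS:
            return False  # mute + liquid: position not made
        return True
    return False

print(long_by_position('pt'))  # True, as in 'redemptor' -> 'em' long
print(long_by_position('br'))  # False, mute + liquid as in 'abrante'
```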
-
_tokenize_syllables(word)[source]¶
Tokenize syllables for word. "mihi" -> [{"syllable": "mi", index: 0, ... } ... ]
Syllable properties:
syllable: string -> syllable
index: int -> position in word
long_by_nature: bool -> is syllable long by nature
accented: bool -> does receive accent
long_by_position: bool -> is syllable long by position
- Parameters
word (str) – string
- Return type
List[Dict]
- Returns
list
>>> Scansion()._tokenize_syllables("mihi")
[{'syllable': 'mi', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'hi', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}]
>>> Scansion()._tokenize_syllables("ivi")
[{'syllable': 'i', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'vi', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}]
>>> Scansion()._tokenize_syllables("audītū")
[{'syllable': 'au', 'index': 0, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': False}, {'syllable': 'dī', 'index': 1, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'tū', 'index': 2, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': False}]
>>> Scansion()._tokenize_syllables("ā")
[{'syllable': 'ā', 'index': 0, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': True}]
>>> Scansion()._tokenize_syllables("conjiciō")
[{'syllable': 'con', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': False}, {'syllable': 'ji', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'ci', 'index': 2, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}, {'syllable': 'ō', 'index': 3, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': False}]
>>> Scansion()._tokenize_syllables("lingua")
[{'syllable': 'lin', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}, {'syllable': 'gua', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}]
>>> Scansion()._tokenize_syllables("abrante")
[{'syllable': 'ab', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, 'mute+liquid'), 'accented': False}, {'syllable': 'ran', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}, {'syllable': 'te', 'index': 2, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}]
>>> Scansion()._tokenize_syllables("redemptor")
[{'syllable': 'red', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}, {'syllable': 'em', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}, {'syllable': 'ptor', 'index': 2, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}]
>>> Scansion()._tokenize_syllables("nagrante")
[{'syllable': 'na', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, 'mute+liquid'), 'accented': False}, {'syllable': 'gran', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}, {'syllable': 'te', 'index': 2, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}]
-
_tokenize_words(sentence)[source]¶
Tokenize words for sentence. "Puella bona est" -> [{word: puella, index: 0, ... }, ... ]
Word properties:
word: string -> word
index: int -> position in sentence
syllables: list -> list of syllable objects
syllables_count: int -> number of syllables in word
- Parameters
sentence (str) – string
- Return type
List[Dict]
- Returns
list
>>> Scansion()._tokenize_words('dedērunt te miror antōnī quorum.')
[{'word': 'dedērunt', 'index': 0, 'syllables': [{'syllable': 'de', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}, {'syllable': 'dē', 'index': 1, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'runt', 'index': 2, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': False}], 'syllables_count': 3}, {'word': 'te', 'index': 1, 'syllables': [{'syllable': 'te', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}], 'syllables_count': 1}, {'word': 'miror', 'index': 2, 'syllables': [{'syllable': 'mi', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'ror', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 2}, {'word': 'antōnī', 'index': 3, 'syllables': [{'syllable': 'an', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': False}, {'syllable': 'tō', 'index': 1, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'nī', 'index': 2, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 3}, {'word': 'quorum.', 'index': 4, 'syllables': [{'syllable': 'quo', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'rum', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 2}]
>>> Scansion()._tokenize_words('a spes co i no xe cta.')
[{'word': 'a', 'index': 0, 'syllables': [{'syllable': 'a', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, 'sest'), 'accented': True}], 'syllables_count': 1}, {'word': 'spes', 'index': 1, 'syllables': [{'syllable': 'spes', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}], 'syllables_count': 1}, {'word': 'co', 'index': 2, 'syllables': [{'syllable': 'co', 'index': 0, 'elide': (True, 'weak'), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}], 'syllables_count': 1}, {'word': 'i', 'index': 3, 'syllables': [{'syllable': 'i', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}], 'syllables_count': 1}, {'word': 'no', 'index': 4, 'syllables': [{'syllable': 'no', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}], 'syllables_count': 1}, {'word': 'xe', 'index': 5, 'syllables': [{'syllable': 'xe', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}], 'syllables_count': 1}, {'word': 'cta.', 'index': 6, 'syllables': [{'syllable': 'cta', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}], 'syllables_count': 1}]
>>> Scansion()._tokenize_words('x')
[]
>>> Scansion()._tokenize_words('atae amo.')
[{'word': 'atae', 'index': 0, 'syllables': [{'syllable': 'a', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'tae', 'index': 1, 'elide': (True, 'strong'), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 2}, {'word': 'amo.', 'index': 1, 'syllables': [{'syllable': 'a', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'mo', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 2}]
>>> Scansion()._tokenize_words('bar rid.')
[{'word': 'bar', 'index': 0, 'syllables': [{'syllable': 'bar', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}], 'syllables_count': 1}, {'word': 'rid.', 'index': 1, 'syllables': [{'syllable': 'rid', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}], 'syllables_count': 1}]
>>> Scansion()._tokenize_words('ba brid.')
[{'word': 'ba', 'index': 0, 'syllables': [{'syllable': 'ba', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, 'mute+liquid'), 'accented': True}], 'syllables_count': 1}, {'word': 'brid.', 'index': 1, 'syllables': [{'syllable': 'brid', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}], 'syllables_count': 1}]
-
tokenize
(text)[source]¶ Tokenize text on supplied characters. “Puella bona est. Puer malus est.” -> [ [{word: puella, syllables: […], index: 0}, … ], … ] :rtype:
List
[Dict
] :return:list>>> Scansion().tokenize('puella bona est. puer malus est.') [{'plain_text_sentence': 'puella bona est', 'structured_sentence': [{'word': 'puella', 'index': 0, 'syllables': [{'syllable': 'pu', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}, {'syllable': 'el', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}, {'syllable': 'la', 'index': 2, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 3}, {'word': 'bona', 'index': 1, 'syllables': [{'syllable': 'bo', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'na', 'index': 1, 'elide': (True, 'weak'), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 2}, {'word': 'est', 'index': 2, 'syllables': [{'syllable': 'est', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}], 'syllables_count': 1}]}, {'plain_text_sentence': ' puer malus est', 'structured_sentence': [{'word': 'puer', 'index': 0, 'syllables': [{'syllable': 'pu', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'er', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': False}], 'syllables_count': 2}, {'word': 'malus', 'index': 1, 'syllables': [{'syllable': 'ma', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'lus', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 2}, {'word': 'est', 'index': 2, 'syllables': [{'syllable': 'est', 'index': 0, 'elide': (False, None), 
'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}], 'syllables_count': 1}]}, {'plain_text_sentence': '', 'structured_sentence': []}]
-
scan_text
(text)[source]¶ Return a flat list of rhythms. Desired clausula length is passed as a parameter. Clausulae shorter than the specified length can be excluded. :rtype:
List
[str
] :return:>>> Scansion().scan_text('dedērunt te miror antōnī quorum. sī quid est in mē ingenī jūdicēs quod sentiō.') ['u--uuu---ux', 'u---u--u---ux']
-
8.1.12.1.1.9. cltk.prosody.lat.scansion_constants module¶
Configuration class for specifying scansion constants.
-
class
cltk.prosody.lat.scansion_constants.
ScansionConstants
(unstressed='U', stressed='-', optional_terminal_ending='X', separator='|')[source]¶ Bases:
object
Constants containing strings have characters in upper and lower case, since they will often be used in regular expressions and to preserve a verse's original case.
This class also allows users to customize scansion constants and scanner behavior.
>>> constants = ScansionConstants(unstressed="U",stressed= "-", optional_terminal_ending="X") >>> print(constants.DACTYL) -UU
>>> smaller_constants = ScansionConstants( ... unstressed="˘",stressed= "¯", optional_terminal_ending="x") >>> print(smaller_constants.DACTYL) ¯˘˘
-
HEXAMETER_ENDING
¶ The following two constants are not official scansion terms; they mark patterns that are invalid in hexameters
-
DOUBLED_CONSONANTS
¶ The prefix order is not arbitrary: one must match on extra before ex
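The ordering concern can be illustrated with a small regex sketch (the prefixes here are illustrative, not the actual constant values):

```python
import re

# In a regex alternation the leftmost branch that matches wins, so longer
# prefixes must precede their own substrings: "extra" before "ex".
good = re.compile(r"^(extra|ex)")
bad = re.compile(r"^(ex|extra)")

print(good.match("extrahere").group(1))  # extra
print(bad.match("extrahere").group(1))   # ex -- "tra" is stranded
```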
-
8.1.12.1.1.10. cltk.prosody.lat.scansion_formatter module¶
Utility class for formatting scansion patterns
-
class
cltk.prosody.lat.scansion_formatter.
ScansionFormatter
(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>)[source]¶ Bases:
object
Users can specify which scansion symbols to use in the formatting.
>>> print(ScansionFormatter().hexameter( "-UU-UU-UU---UU--")) -UU|-UU|-UU|--|-UU|-- >>> constants = ScansionConstants(unstressed="˘", stressed= "¯", optional_terminal_ending="x") >>> formatter = ScansionFormatter(constants) >>> print(formatter.hexameter( "¯˘˘¯˘˘¯˘˘¯¯¯˘˘¯¯")) ¯˘˘|¯˘˘|¯˘˘|¯¯|¯˘˘|¯¯
-
hexameter
(line)[source]¶ Format a string of hexameter metrical stress patterns into foot divisions
- Parameters
line (
str
) – the scansion pattern- Return type
str
- Returns
the scansion string formatted with foot breaks
>>> print(ScansionFormatter().hexameter( "-UU-UU-UU---UU--")) -UU|-UU|-UU|--|-UU|--
-
merge_line_scansion
(line, scansion)[source]¶ Merge a line of verse with its scansion string. Do not accent diphthongs.
- Parameters
line (
str
) – the original Latin verse linescansion (
str
) – the scansion pattern
- Return type
str
- Returns
the original line with the scansion pattern applied via macrons
>>> print(ScansionFormatter().merge_line_scansion( ... "Arma virumque cano, Troiae qui prīmus ab ōrīs", ... "- U U - U U - UU- - - U U - -")) Ārma virūmque canō, Troiae quī prīmus ab ōrīs
>>> print(ScansionFormatter().merge_line_scansion( ... "lītora, multum ille et terrīs iactātus et alto", ... " - U U - - - - - - - U U - U")) lītora, mūltum īlle ēt tērrīs iāctātus et ālto
>>> print(ScansionFormatter().merge_line_scansion( ... 'aut facere, haec a te dictaque factaque sunt', ... ' - U U - - - - U U - U U - ')) aut facere, haec ā tē dīctaque fāctaque sūnt
-
8.1.12.1.1.11. cltk.prosody.lat.string_utils module¶
Utility class for processing scansion and text.
-
cltk.prosody.lat.string_utils.
remove_punctuation_dict
()[source]¶ Provide a dictionary for removing punctuation, swallowing spaces.
:return dict with punctuation from the unicode table
>>> print("I'm ok! Oh #%&*()[]{}!? Fine!".translate( ... remove_punctuation_dict()).lstrip()) Im ok Oh Fine
- Return type
Dict
[int
,None
]
-
cltk.prosody.lat.string_utils.
punctuation_for_spaces_dict
()[source]¶ Provide a dictionary for removing punctuation, keeping spaces. Essential for scansion to keep stress patterns in alignment with original vowel positions in the verse.
:return dict with punctuation from the unicode table
>>> print("I'm ok! Oh #%&*()[]{}!? Fine!".translate( ... punctuation_for_spaces_dict()).strip()) I m ok Oh Fine
- Return type
Dict
[int
,str
]
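The doctest behavior suggests a translation table that maps every Unicode punctuation code point to a space. A minimal sketch of that construction with the standard library (an illustration of the approach, not the cltk source; the helper name is made up):

```python
import sys
import unicodedata

def punctuation_to_space_dict():
    # map every Unicode punctuation code point (category "P*") to a
    # single space, preserving character positions for scansion alignment
    return {i: " " for i in range(sys.maxunicode)
            if unicodedata.category(chr(i)).startswith("P")}

table = punctuation_to_space_dict()
print(repr("I'm ok!".translate(table)))
```

Because every punctuation mark becomes exactly one space, the vowel positions in the cleaned string line up with those in the original verse.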
-
cltk.prosody.lat.string_utils.
differences
(scansion, candidate)[source]¶ Given two strings, return a list of index positions where the contents differ.
- Parameters
scansion (
str
) –candidate (
str
) –
- Return type
List
[int
]- Returns
>>> differences("abc", "abz") [2]
-
cltk.prosody.lat.string_utils.
mark_list
(line)[source]¶ Given a string, return a list of index positions where a non-blank character exists.
- Parameters
line (
str
) –- Return type
List
[int
]- Returns
>>> mark_list(" a b c") [1, 3, 5]
-
cltk.prosody.lat.string_utils.
space_list
(line)[source]¶ Given a string, return a list of index positions where a blank space occurs.
- Parameters
line (
str
) –- Return type
List
[int
]- Returns
>>> space_list("    abc ") [0, 1, 2, 3, 7]
-
cltk.prosody.lat.string_utils.
flatten
(list_of_lists)[source]¶ Given a list of lists, flatten all the items into one list.
- Parameters
list_of_lists –
- Returns
>>> flatten([ [1, 2, 3], [4, 5, 6]]) [1, 2, 3, 4, 5, 6]
-
cltk.prosody.lat.string_utils.
to_syllables_with_trailing_spaces
(line, syllables)[source]¶ Given a line of syllables and spaces, and a list of syllables, produce a list of the syllables with trailing spaces attached as appropriate.
- Parameters
line (
str
) –syllables (
List
[str
]) –
- Return type
List
[str
]- Returns
>>> to_syllables_with_trailing_spaces(' arma virumque cano ', ... ['ar', 'ma', 'vi', 'rum', 'que', 'ca', 'no' ]) [' ar', 'ma ', 'vi', 'rum', 'que ', 'ca', 'no ']
-
cltk.prosody.lat.string_utils.
join_syllables_spaces
(syllables, spaces)[source]¶ Given a list of syllables, and a list of integers indicating the position of spaces, return a string that has a space inserted at the designated points.
- Parameters
syllables (
List
[str
]) –spaces (
List
[int
]) –
- Return type
str
- Returns
>>> join_syllables_spaces(["won", "to", "tree", "dun"], [3, 6, 11]) 'won to tree dun'
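The doctest is consistent with joining the syllables and then inserting spaces left to right, where each index refers to a position in the final, space-bearing string. A sketch under that assumption (not the cltk source):

```python
def join_syllables_spaces(syllables, spaces):
    # join the syllables, then insert spaces left to right; each given
    # index is a position in the final string, so earlier insertions
    # shift later ones into place naturally
    chars = list("".join(syllables))
    for pos in spaces:
        chars.insert(pos, " ")
    return "".join(chars)

print(join_syllables_spaces(["won", "to", "tree", "dun"], [3, 6, 11]))
```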
-
cltk.prosody.lat.string_utils.
starts_with_qu
(word)[source]¶ Determine whether or not a word starts with the letters Q and U.
- Parameters
word –
- Return type
bool
- Returns
>>> starts_with_qu("qui") True >>> starts_with_qu("Quirites") True
-
cltk.prosody.lat.string_utils.
stress_positions
(stress, scansion)[source]¶ Given a stress value and a scansion line, return the index positions of the stresses.
- Parameters
stress (
str
) –scansion (
str
) –
- Return type
List
[int
]- Returns
>>> stress_positions("-", " - U U - UU - U U") [0, 3, 6]
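The doctest shows positions counted among the scansion marks themselves, ignoring spaces. A sketch of that logic (assumed behavior, not the cltk source):

```python
def stress_positions(stress, scansion):
    # drop the spaces so that indices count scansion marks, not characters
    marks = scansion.replace(" ", "")
    return [i for i, mark in enumerate(marks) if mark == stress]

print(stress_positions("-", " - U U - UU - U U"))
```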
-
cltk.prosody.lat.string_utils.
merge_elisions
(elided)[source]¶ Given a list of strings with different space-swapping elisions applied, merge the elisions, applying as many of them as possible without compounding the omissions.
- Parameters
elided (
List
[str
]) –- Return type
str
- Returns
>>> merge_elisions([ ... "ignavae agua multum hiatus", "ignav agua multum hiatus" ,"ignavae agua mult hiatus"]) 'ignav agua mult hiatus'
-
cltk.prosody.lat.string_utils.
move_consonant_right
(letters, positions)[source]¶ Given a list of letters, and a list of consonant positions, move the consonant positions to the right, merging strings as necessary.
- Parameters
letters (
List
[str
]) –positions (
List
[int
]) –
- Return type
List
[str
]- Returns
>>> move_consonant_right(list("abbra"), [ 2, 3]) ['a', 'b', '', '', 'bra']
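The example behavior is consistent with merging each flagged consonant into the cell on its right, leaving an empty string behind so that list positions are preserved. A sketch under that assumption:

```python
def move_consonant_right(letters, positions):
    # append each flagged consonant to its right neighbor, leaving an
    # empty string behind to preserve the original list positions
    letters = letters[:]
    for pos in positions:
        letters[pos + 1] = letters[pos] + letters[pos + 1]
        letters[pos] = ""
    return letters

print(move_consonant_right(list("abbra"), [2, 3]))
```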
-
cltk.prosody.lat.string_utils.
move_consonant_left
(letters, positions)[source]¶ Given a list of letters, and a list of consonant positions, move the consonant positions to the left, merging strings as necessary.
- Parameters
letters (
List
[str
]) –positions (
List
[int
]) –
- Return type
List
[str
]- Returns
>>> move_consonant_left(['a', 'b', '', '', 'bra'], [1]) ['ab', '', '', '', 'bra']
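The mirror operation merges each flagged consonant into the cell on its left; a sketch consistent with the doctest (not the cltk source):

```python
def move_consonant_left(letters, positions):
    # prepend each flagged consonant to its left neighbor, leaving an
    # empty string behind to preserve the original list positions
    letters = letters[:]
    for pos in positions:
        letters[pos - 1] = letters[pos - 1] + letters[pos]
        letters[pos] = ""
    return letters

print(move_consonant_left(['a', 'b', '', '', 'bra'], [1]))
```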
-
cltk.prosody.lat.string_utils.
merge_next
(letters, positions)[source]¶ Given a list of letter positions, merge each letter with its next neighbor.
- Parameters
letters (
List
[str
]) –positions (
List
[int
]) –
- Return type
List
[str
]- Returns
>>> merge_next(['a', 'b', 'o', 'v', 'o' ], [0, 2]) ['ab', '', 'ov', '', 'o'] >>> # Note: because it operates on the original list passed in, the effect is not cumulative: >>> merge_next(['a', 'b', 'o', 'v', 'o' ], [0, 2, 3]) ['ab', '', 'ov', 'o', '']
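The non-cumulative note follows from in-place mutation: a later position sees the blank left by an earlier merge. A sketch consistent with both doctests (assumed logic, not the cltk source):

```python
def merge_next(letters, positions):
    # mutates the list in place: a later position sees the blank left by
    # an earlier merge, so the effect is not cumulative
    for pos in positions:
        letters[pos] = letters[pos] + letters[pos + 1]
        letters[pos + 1] = ""
    return letters

print(merge_next(['a', 'b', 'o', 'v', 'o'], [0, 2, 3]))
```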
-
cltk.prosody.lat.string_utils.
remove_blanks
(letters)[source]¶ Given a list of letters, remove any empty strings.
- Parameters
letters (
List
[str
]) –- Returns
>>> remove_blanks(['a', '', 'b', '', 'c']) ['a', 'b', 'c']
-
cltk.prosody.lat.string_utils.
split_on
(word, section)[source]¶ Given a string, split on a section, and return the two sections as a tuple.
- Parameters
word (
str
) –section (
str
) –
- Return type
Tuple
[str
,str
]- Returns
>>> split_on('hamrye', 'ham') ('ham', 'rye')
-
cltk.prosody.lat.string_utils.
remove_blank_spaces
(syllables)[source]¶ Given a list of letters, remove any blank spaces or empty strings.
- Parameters
syllables (
List
[str
]) –- Return type
List
[str
]- Returns
>>> remove_blank_spaces(['', 'a', ' ', 'b', ' ', 'c', '']) ['a', 'b', 'c']
-
cltk.prosody.lat.string_utils.
overwrite
(char_list, regexp, quality, offset=0)[source]¶ Given a list of characters and spaces, a matching regular expression, and a quality or character, replace the character at each match position with the quality character, shifted by an offset if provided.
- Parameters
char_list (
List
[str
]) –regexp (
str
) –quality (
str
) –offset (
int
) –
- Return type
List
[str
]- Returns
>>> overwrite(list("multe igne"), r"e\s[aeiou]", " ") ['m', 'u', 'l', 't', ' ', ' ', 'i', 'g', 'n', 'e']
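The doctest is consistent with scanning the joined string and stamping the quality character at the start of each match, shifted by the optional offset. A sketch under that assumption (not the cltk source):

```python
import re

def overwrite(char_list, regexp, quality, offset=0):
    # scan the joined string and stamp the quality character at the
    # start of each match, shifted by the optional offset
    line = "".join(char_list)
    for match in re.finditer(regexp, line):
        char_list[match.start() + offset] = quality
    return char_list

print(overwrite(list("multe igne"), r"e\s[aeiou]", " "))
```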
-
cltk.prosody.lat.string_utils.
overwrite_dipthong
(char_list, regexp, quality)[source]¶ Given a list of characters and spaces, a matching regular expression, and a quality or character, overwrite both characters of each matched diphthong with the quality character.
- Parameters
char_list (
List
[str
]) – a list of charactersregexp (
str
) – a matching regular expressionquality (
str
) – a quality or character to replace
- Return type
List
[str
]- Returns
a list of characters with the diphthong overwritten
>>> overwrite_dipthong(list("multae aguae"), r"ae\s[aeou]", " ") ['m', 'u', 'l', 't', ' ', ' ', ' ', 'a', 'g', 'u', 'a', 'e']
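Since a diphthong spans two characters, both positions at the start of the match get the quality character; a sketch consistent with the doctest (assumed logic, not the cltk source):

```python
import re

def overwrite_dipthong(char_list, regexp, quality):
    # a diphthong spans two characters, so stamp the quality character
    # over both positions at the start of each match
    line = "".join(char_list)
    for match in re.finditer(regexp, line):
        char_list[match.start()] = quality
        char_list[match.start() + 1] = quality
    return char_list

print(overwrite_dipthong(list("multae aguae"), r"ae\s[aeou]", " "))
```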
-
cltk.prosody.lat.string_utils.
get_unstresses
(stresses, count)[source]¶ Given a list of stressed positions, and count of possible positions, return a list of the unstressed positions.
- Parameters
stresses (
List
[int
]) – a list of stressed positionscount (
int
) – the number of possible positions
- Return type
List
[int
]- Returns
a list of unstressed positions
>>> get_unstresses([0, 3, 6, 9, 12, 15], 17) [1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16]
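The complement is straightforward: every possible position that is not stressed is unstressed. A one-line sketch (not necessarily the cltk source):

```python
def get_unstresses(stresses, count):
    # every position not in the stressed list is unstressed
    return [pos for pos in range(count) if pos not in stresses]

print(get_unstresses([0, 3, 6, 9, 12, 15], 17))
```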
8.1.12.1.1.12. cltk.prosody.lat.syllabifier module¶
Latin language syllabifier. Parses a Latin word or a space-separated list of words into a list of syllables. Consonantal I is transformed into a J at the start of a word as necessary. Tuned for poetry and verse, this class is tolerant of isolated single-character consonants that may appear due to elision.
-
class
cltk.prosody.lat.syllabifier.
Syllabifier
(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>)[source]¶ Bases:
object
Scansion constants can be modified and passed into the constructor if desired.
-
syllabify
(words)[source]¶ Parse a Latin word into a list of syllable strings.
- Parameters
words (
str
) – a string containing one Latin word or many words separated by spaces.- Return type
List
[str
]- Returns
list of string, each representing a syllable.
>>> syllabifier = Syllabifier() >>> print(syllabifier.syllabify("fuit")) ['fu', 'it'] >>> print(syllabifier.syllabify("libri")) ['li', 'bri'] >>> print(syllabifier.syllabify("contra")) ['con', 'tra'] >>> print(syllabifier.syllabify("iaculum")) ['ja', 'cu', 'lum'] >>> print(syllabifier.syllabify("amo")) ['a', 'mo'] >>> print(syllabifier.syllabify("bracchia")) ['brac', 'chi', 'a'] >>> print(syllabifier.syllabify("deinde")) ['dein', 'de'] >>> print(syllabifier.syllabify("certabant")) ['cer', 'ta', 'bant'] >>> print(syllabifier.syllabify("aere")) ['ae', 're'] >>> print(syllabifier.syllabify("adiungere")) ['ad', 'jun', 'ge', 're'] >>> print(syllabifier.syllabify("mōns")) ['mōns'] >>> print(syllabifier.syllabify("domus")) ['do', 'mus'] >>> print(syllabifier.syllabify("lixa")) ['li', 'xa'] >>> print(syllabifier.syllabify("asper")) ['as', 'per'] >>> # handle doubles >>> print(syllabifier.syllabify("siccus")) ['sic', 'cus'] >>> # handle liquid + liquid >>> print(syllabifier.syllabify("almus")) ['al', 'mus'] >>> # handle liquid + mute >>> print(syllabifier.syllabify("ambo")) ['am', 'bo'] >>> print(syllabifier.syllabify("anguis")) ['an', 'guis'] >>> print(syllabifier.syllabify("arbor")) ['ar', 'bor'] >>> print(syllabifier.syllabify("pulcher")) ['pul', 'cher'] >>> print(syllabifier.syllabify("ruptus")) ['ru', 'ptus'] >>> print(syllabifier.syllabify("Bīthÿnus")) ['Bī', 'thÿ', 'nus'] >>> print(syllabifier.syllabify("sanguen")) ['san', 'guen'] >>> print(syllabifier.syllabify("unguentum")) ['un', 'guen', 'tum'] >>> print(syllabifier.syllabify("lingua")) ['lin', 'gua'] >>> print(syllabifier.syllabify("linguā")) ['lin', 'guā'] >>> print(syllabifier.syllabify("languidus")) ['lan', 'gui', 'dus'] >>> print(syllabifier.syllabify("suis")) ['su', 'is'] >>> print(syllabifier.syllabify("habui")) ['ha', 'bu', 'i'] >>> print(syllabifier.syllabify("habuit")) ['ha', 'bu', 'it'] >>> print(syllabifier.syllabify("qui")) ['qui'] >>> print(syllabifier.syllabify("quibus")) ['qui', 'bus'] >>> 
print(syllabifier.syllabify("hui")) ['hui'] >>> print(syllabifier.syllabify("cui")) ['cui'] >>> print(syllabifier.syllabify("huic")) ['huic']
-
_setup
(word)[source]¶ Prepares a word for syllable processing.
If the word starts with a prefix, process it separately. :param word: :rtype:
List
[str
] :return:
-
_process
(word)[source]¶ Process a word into a list of strings representing the syllables of the word. This method describes rules for consonant grouping behaviors and then iteratively applies those rules to the list of letters that comprise the word, until all the letters are grouped into appropriate syllable groups.
- Parameters
word (
str
) –- Return type
List
[str
]- Returns
-
_starting_consonants_only
(letters)[source]¶ Return a list of starting consonant positions.
- Return type
list
-
_ending_consonants_only
(letters)[source]¶ Return a list of positions for ending consonants.
- Return type
List
[int
]
-
_find_solo_consonant
(letters)[source]¶ Find the positions of any solo consonants that are not yet paired with a vowel.
- Return type
List
[int
]
-
_find_consonant_cluster
(letters)[source]¶ Find clusters of consonants that do not contain a vowel. :type letters:
List
[str
] :param letters: :rtype:List
[int
] :return:
-
_move_consonant
(letters, positions)[source]¶ Given a list of consonant positions, move the consonants according to certain consonant syllable behavioral rules for gathering and grouping.
- Parameters
letters (
list
) –positions (
List
[int
]) –
- Return type
List
[str
]- Returns
-
get_syllable_count
(syllables)[source]¶ Counts the number of syllable groups that would occur after elision.
Often we will want to preserve the position and separation of syllables so that they can be used to reconstitute a line and to apply stresses to the original word positions. However, we also want to be able to count the number of syllables accurately.
- Parameters
syllables (
List
[str
]) –- Return type
int
- Returns
>>> syllabifier = Syllabifier() >>> print(syllabifier.get_syllable_count([ ... 'Jām', 'tūm', 'c', 'au', 'sus', 'es', 'u', 'nus', 'I', 'ta', 'lo', 'rum'])) 11
-
8.1.12.1.1.13. cltk.prosody.lat.verse module¶
Data structure class for a line of metrical verse.
-
class
cltk.prosody.lat.verse.
Verse
(original, scansion='', meter=None, valid=False, syllable_count=0, accented='', scansion_notes=None, syllables=None)[source]¶ Bases:
object
Class representing a line of metrical verse.
This class is round-trippable; the __repr__ call can be used for construction.
>>> positional_hex = Verse(original='impulerit. Tantaene animis caelestibus irae?', ... scansion='- U U - - - U U - - - U U - - ', meter='hexameter', ... valid=True, syllable_count=15, accented='īmpulerīt. Tāntaene animīs caelēstibus īrae?', ... scansion_notes=['Valid by positional stresses.'], ... syllables = ['īm', 'pu', 'le', 'rīt', 'Tān', 'taen', 'a', 'ni', 'mīs', 'cae', 'lēs', 'ti', 'bus', 'i', 'rae']) >>> dupe = eval(positional_hex.__repr__()) >>> dupe Verse(original='impulerit. Tantaene animis caelestibus irae?', scansion='- U U - - - U U - - - U U - - ', meter='hexameter', valid=True, syllable_count=15, accented='īmpulerīt. Tāntaene animīs caelēstibus īrae?', scansion_notes=['Valid by positional stresses.'], syllables = ['īm', 'pu', 'le', 'rīt', 'Tān', 'taen', 'a', 'ni', 'mīs', 'cae', 'lēs', 'ti', 'bus', 'i', 'rae']) >>> positional_hex Verse(original='impulerit. Tantaene animis caelestibus irae?', scansion='- U U - - - U U - - - U U - - ', meter='hexameter', valid=True, syllable_count=15, accented='īmpulerīt. Tāntaene animīs caelēstibus īrae?', scansion_notes=['Valid by positional stresses.'], syllables = ['īm', 'pu', 'le', 'rīt', 'Tān', 'taen', 'a', 'ni', 'mīs', 'cae', 'lēs', 'ti', 'bus', 'i', 'rae'])
-
working_line
¶ placeholder for data transformations
-
8.1.12.1.1.14. cltk.prosody.lat.verse_scanner module¶
Parent class and utility class for producing a scansion pattern for a line of Latin verse.
Some useful methods:
- perform a conservative i to j transformation
- perform elisions
- accent vowels by position
- break the line into a list of syllables by calling a Syllabifier class, which may be injected into this class's constructor
-
class
cltk.prosody.lat.verse_scanner.
VerseScanner
(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>, syllabifier=<cltk.prosody.lat.syllabifier.Syllabifier object>, **kwargs)[source]¶ Bases:
object
The scansion symbols used can be configured by passing a suitable constants class to the constructor.
-
transform_i_to_j
(line)[source]¶ Transform instances of consonantal i to j :type line:
str
:param line: :rtype:str
:return:>>> print(VerseScanner().transform_i_to_j("iactātus")) jactātus >>> print(VerseScanner().transform_i_to_j("bracchia")) bracchia
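A conservative version of the rule can be sketched with a regular expression: only a word-initial i directly followed by a vowel is treated as consonantal. This is an illustration of the idea, not the cltk implementation, and the helper name is made up:

```python
import re

def initial_i_to_j(text):
    # a word-initial i followed by a vowel is consonantal and becomes j;
    # a medial i, or an i before a consonant, is left untouched
    return re.sub(r"\bi(?=[aeiouāēīōū])", "j", text)

print(initial_i_to_j("iactātus"))  # jactātus
print(initial_i_to_j("bracchia"))  # bracchia (medial i untouched)
```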
-
transform_i_to_j_optional
(line)[source]¶ Sometimes for the demands of meter a more permissive i to j transformation is warranted.
- Parameters
line (
str
) –- Return type
str
- Returns
>>> print(VerseScanner().transform_i_to_j_optional("Italiam")) Italjam >>> print(VerseScanner().transform_i_to_j_optional("Lāvīniaque")) Lāvīnjaque >>> print(VerseScanner().transform_i_to_j_optional("omnium")) omnjum
-
accent_by_position
(verse_line)[source]¶ Accent vowels according to the rules of scansion.
- Parameters
verse_line (
str
) – a line of unaccented verse- Return type
str
- Returns
the same line with vowels accented by position
>>> print(VerseScanner().accent_by_position( ... "Arma virumque cano, Troiae qui primus ab oris").lstrip()) Ārma virūmque canō Trojae qui primus ab oris
-
elide_all
(line)[source]¶ Given a string of space separated syllables, erase with spaces the syllable portions that would disappear according to the rules of elision.
- Parameters
line (
str
) –- Return type
str
- Returns
-
calc_offset
(syllables_spaces)[source]¶ Calculate a dictionary of accent positions from a list of syllables with spaces.
- Parameters
syllables_spaces (
List
[str
]) –- Return type
Dict
[int
,int
]- Returns
-
produce_scansion
(stresses, syllables_wspaces, offset_map)[source]¶ Create a scansion string that has stressed and unstressed syllable positions in locations that correspond with the original text's syllable vowels.
- Parameters
stresses – list of syllable positions
syllables_wspaces – list of syllables with spaces escaped for punctuation or elision
offset_map – dictionary of syllable positions and an offset amount, which is the number of spaces to skip in the original line before inserting the accent
- Return type
str
-
flag_dipthongs
(syllables)[source]¶ Return a list of syllables that contain a diphthong
- Parameters
syllables (
List
[str
]) –- Return type
List
[int
]- Returns
-
elide
(line, regexp, quantity=1, offset=0)[source]¶ Erase a section of a line, matching on a regex, pushing in a quantity of blank spaces, and jumping forward with an offset if necessary. If the elided vowel was strong, the vowel it merges with takes on the stress.
- Parameters
line (
str
) –regexp (
str
) –quantity (
int
) –offset (
int
) –
- Return type
str
- Returns
>>> print(VerseScanner().elide("uvae avaritia", r"[e]\s*[a]")) uv āvaritia >>> print(VerseScanner().elide("mare avaritia", r"[e]\s*[a]")) mar avaritia
-
correct_invalid_start
(scansion)[source]¶ If a hexameter, hendecasyllable, or pentameter scansion starts with a spondee, an unstressed syllable in the third position must actually be stressed, so we will convert it: - - | U -> - - | -
- Parameters
scansion (
str
) –- Return type
str
- Returns
>>> print(VerseScanner().correct_invalid_start( ... " - - U U - - U U U U U U - -").strip()) - - - - - - U U U U U U - -
-
correct_first_two_dactyls
(scansion)[source]¶ If a hexameter or pentameter starts with a spondee, an unstressed syllable in the third position must actually be stressed, so we will convert it: - - | U -> - - | - And/or if the starting pattern is spondee + trochee + stressed, then the unstressed trochee can be corrected: - - | - U | - -> - - | - - | -
- Parameters
scansion (
str
) –- Return type
str
- Returns
>>> print(VerseScanner().correct_first_two_dactyls( ... " - - U U - - U U U U U U - -")) - - - - - - U U U U U U - -
-