8.1.12.1.1. cltk.prosody.lat package¶

8.1.12.1.1.1. Submodules¶

8.1.12.1.1.2. cltk.prosody.lat.clausulae_analysis module¶

Return dictionary of clausulae found in the prosody of Latin prose.

The clausulae analysis function returns a dictionary in which the key is the type of clausula and the value is the number of times it occurs in the text. The list of clausulae used in the method is derived from the 2019 Journal of Roman Studies paper “Auceps syllabarum: A Digital Analysis of Latin Prose Rhythm”. The list of clausulae are mutually exclusive so no one rhythm will be counted in multiple categories.

class cltk.prosody.lat.clausulae_analysis.Clausula(rhythm_name, rhythm)¶

Bases: tuple

property rhythm¶: Alias for field number 1

property rhythm_name¶: Alias for field number 0

class cltk.prosody.lat.clausulae_analysis.Clausulae(rhythms=[Clausula(rhythm_name='cretic_trochee', rhythm='-u--x'), Clausula(rhythm_name='cretic_trochee_resolved_a', rhythm='uuu--x'), Clausula(rhythm_name='cretic_trochee_resolved_b', rhythm='-uuu-x'), Clausula(rhythm_name='cretic_trochee_resolved_c', rhythm='-u-uux'), Clausula(rhythm_name='double_cretic', rhythm='-u--ux'), Clausula(rhythm_name='molossus_cretic', rhythm='----ux'), Clausula(rhythm_name='double_molossus_cretic_resolved_a', rhythm='uuu--ux'), Clausula(rhythm_name='double_molossus_cretic_resolved_b', rhythm='-uuu-ux'), Clausula(rhythm_name='double_molossus_cretic_resolved_c', rhythm='-u-uux'), Clausula(rhythm_name='double_molossus_cretic_resolved_d', rhythm='uu---ux'), Clausula(rhythm_name='double_molossus_cretic_resolved_e', rhythm='-uu--ux'), Clausula(rhythm_name='double_molossus_cretic_resolved_f', rhythm='--uu-ux'), Clausula(rhythm_name='double_molossus_cretic_resolved_g', rhythm='---uuux'), Clausula(rhythm_name='double_molossus_cretic_resolved_h', rhythm='-u---ux'), Clausula(rhythm_name='double_trochee', rhythm='-u-x'), Clausula(rhythm_name='double_trochee_resolved_a', rhythm='uuu-x'), Clausula(rhythm_name='double_trochee_resolved_b', rhythm='-uuux'), Clausula(rhythm_name='hypodochmiac', rhythm='-u-ux'), Clausula(rhythm_name='hypodochmiac_resolved_a', rhythm='uuu-ux'), Clausula(rhythm_name='hypodochmiac_resolved_b', rhythm='-uuuux'), Clausula(rhythm_name='spondaic', rhythm='---x'), Clausula(rhythm_name='heroic', rhythm='-uu-x')])[source]¶

Bases: object

clausulae_analysis(prosody)[source]¶: Return dictionary in which the key is a type of clausula and the value is its frequency. :type prosody: List :param prosody: the prosody of a prose text (must be in the format of the scansion produced by the scanner classes. :rtype: List[Dict[str, int]] :return: dictionary of prosody >>> Clausulae().clausulae_analysis([‘-uuu-uuu-u–x’, ‘uu-uu-uu—-x’]) [{‘cretic_trochee’: 1}, {‘cretic_trochee_resolved_a’: 0}, {‘cretic_trochee_resolved_b’: 0}, {‘cretic_trochee_resolved_c’: 0}, {‘double_cretic’: 0}, {‘molossus_cretic’: 0}, {‘double_molossus_cretic_resolved_a’: 0}, {‘double_molossus_cretic_resolved_b’: 0}, {‘double_molossus_cretic_resolved_c’: 0}, {‘double_molossus_cretic_resolved_d’: 0}, {‘double_molossus_cretic_resolved_e’: 0}, {‘double_molossus_cretic_resolved_f’: 0}, {‘double_molossus_cretic_resolved_g’: 0}, {‘double_molossus_cretic_resolved_h’: 0}, {‘double_trochee’: 0}, {‘double_trochee_resolved_a’: 0}, {‘double_trochee_resolved_b’: 0}, {‘hypodochmiac’: 0}, {‘hypodochmiac_resolved_a’: 0}, {‘hypodochmiac_resolved_b’: 0}, {‘spondaic’: 1}, {‘heroic’: 0}]

8.1.12.1.1.3. cltk.prosody.lat.hendecasyllable_scanner module¶

Utility class for producing a scansion pattern for a Latin hendecasyllables.

Given a line of hendecasyllables, the scan method performs a series of transformation and checks are performed and for each one performed successfully, a note is added to the scansion_notes list so that end users may view the provenance of a scansion.

class cltk.prosody.lat.hendecasyllable_scanner.HendecasyllableScanner(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>, syllabifier=<cltk.prosody.lat.syllabifier.Syllabifier object>, optional_tranform=False, *args, **kwargs)[source]¶

Bases: cltk.prosody.lat.verse_scanner.VerseScanner

The scansion symbols used can be configured by passing a suitable constants class to the constructor.

scan(original_line, optional_transform=False)[source]¶

Scan a line of Latin hendecasyllables and produce a scansion pattern, and other data.

Parameters

original_line (str) – the original line of Latin verse
optional_transform (bool) – whether or not to perform i to j transform for syllabification

Return type

Verse

Returns

a Verse object

>>> scanner = HendecasyllableScanner()
>>> print(scanner.scan("Cui dono lepidum novum libellum"))
Verse(original='Cui dono lepidum novum libellum', scansion='  -  U -  U U -   U -   U -  U ', meter='hendecasyllable', valid=True, syllable_count=11, accented='Cui donō lepidūm novūm libēllum', scansion_notes=['Corrected invalid start.'], syllables = ['Cui', 'do', 'no', 'le', 'pi', 'dūm', 'no', 'vūm', 'li', 'bēl', 'lum'])
>>> print(scanner.scan(
... "ārida modo pumice expolitum?").scansion)  
- U -  U U  - U   -  U - U

correct_invalid_start(scansion)[source]¶

The third syllable of a hendecasyllabic line is long, so we will convert it.

Parameters: scansion (str) – scansion string
Return type: str
Returns: scansion string with corrected start

>>> print(HendecasyllableScanner().correct_invalid_start(
... "- U U  U U  - U   -  U - U").strip())
- U -  U U  - U   -  U - U

correct_antepenult_chain(scansion)[source]¶

For hendecasyllables the last three feet of the verse are predictable and do not regularly allow substitutions.

Parameters: scansion (str) – scansion line thus far
Return type: str
Returns: corrected line of scansion

>>> print(HendecasyllableScanner().correct_antepenult_chain(
... "-U -UU UU UU UX").strip())
-U -UU -U -U -X

8.1.12.1.1.4. cltk.prosody.lat.hexameter_scanner module¶

Utility class for producing a scansion pattern for a Latin hexameter.

Given a line of hexameter, the scan method performs a series of transformation and checks are performed and for each one performed successfully, a note is added to the scansion_notes list so that end users may view the provenance of a scansion.

Because hexameters have strict rules on the position and quantity of stressed and unstressed syllables, we can often infer the many stress qualities of the syllables, given a valid hexameter. If the Latin hexameter provided is not accented with macrons, then a best guess is made. For the scansion produced, the stress of a dipthong is indicated in the second of the two vowel positions; for the accented line produced, the dipthong stress is not indicated with any macronized vowels.

class cltk.prosody.lat.hexameter_scanner.HexameterScanner(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>, syllabifier=<cltk.prosody.lat.syllabifier.Syllabifier object>, optional_transform=False, *args, **kwargs)[source]¶

Bases: cltk.prosody.lat.verse_scanner.VerseScanner

The scansion symbols used can be configured by passing a suitable constants class to the constructor.

scan(original_line, optional_transform=False, dactyl_smoothing=False)[source]¶

Scan a line of Latin hexameter and produce a scansion pattern, and other data.

Parameters

original_line (str) – the original line of Latin verse
optional_transform (bool) – whether or not to perform i to j transform for syllabification
dactyl_smoothing (bool) – whether or not to perform dactyl smoothing

Return type

Verse

Returns

a Verse object

>>> scanner = HexameterScanner()

>>> print(HexameterScanner().scan(
... "ēxiguām sedēm pariturae tērra negavit").scansion) 
- -  -   - -   U U -  -  -  U  U - U
>>> print(scanner.scan("impulerit. Tantaene animis caelestibus irae?"))
Verse(original='impulerit. Tantaene animis caelestibus irae?', scansion='-  U U -    -   -   U U -    - -  U U  -  - ', meter='hexameter', valid=True, syllable_count=15, accented='īmpulerīt. Tāntaene animīs caelēstibus īrae?', scansion_notes=['Valid by positional stresses.'], syllables = ['īm', 'pu', 'le', 'rīt', 'Tān', 'taen', 'a', 'ni', 'mīs', 'cae', 'lēs', 'ti', 'bus', 'i', 'rae'])
>>> print(scanner.scan(
... "Arma virumque cano, Troiae qui prīmus ab ōrīs").scansion) 
-  U  U -   U  U -    -  -   -   - U  U  - -
>>> # some hexameters need the optional transformations:
>>> optional_transform_scanner = HexameterScanner(optional_transform=True)
>>> print(optional_transform_scanner.scan(
... "Ītaliam, fāto profugus, Lāvīniaque vēnit").scansion) 
- -  -    - -   U U -    - -  U  U  - U
>>> print(HexameterScanner().scan(
... "lītora, multum ille et terrīs iactātus et alto").scansion) 
- U U   -     -    -   -  -   -  - U  U  -  U
>>> print(HexameterScanner().scan(
... "vī superum saevae memorem Iūnōnis ob īram;").scansion) 
-  U U -    -  -  U U -   - - U  U  - U
>>> # handle multiple elisions
>>> print(scanner.scan("monstrum horrendum, informe, ingens, cui lumen ademptum").scansion) 
-        -  -      -  -     -  -      -  - U  U -   U
>>> # if we have 17 syllables, create a chain of all dactyls
>>> print(scanner.scan("quadrupedante putrem sonitu quatit ungula campum"
... ).scansion) 
-  U U -  U  U  -   U U -   U U  -  U U  -  U
>>> # if we have 13 syllables exactly, we'll create a spondaic hexameter
>>> print(HexameterScanner().scan(
... "illi inter sese multa vi bracchia tollunt").scansion)  
-    -  -   - -  -  -  -   -   UU  -  -
>>> print(HexameterScanner().scan(
... "dat latus; insequitur cumulo praeruptus aquae mons").scansion) 
-   U U   -  U  U -   U U -    - -  U  U   -  -
>>> print(optional_transform_scanner.scan(
... "Non quivis videt inmodulata poëmata iudex").scansion) 
-    - -   U U  -  U U - U  U- U U  - -
>>> print(HexameterScanner().scan(
... "certabant urbem Romam Remoramne vocarent").scansion) 
-  - -   -  -   - -   U U -  U  U - -
>>> # advanced smoothing is available via keyword flags: dactyl_smoothing
>>> # print(HexameterScanner().scan(
#... "his verbis: 'o gnata, tibi sunt ante ferendae",
#... dactyl_smoothing=True).scansion) 
#-   -  -    -   - U   U -  -   -  U  U -   -

correct_invalid_fifth_foot(scansion)[source]¶

The ‘inverted amphibrach’: stressed_unstressed_stressed syllable pattern is invalid in hexameters, so here we coerce it to stressed when it occurs at the end of a line

Parameters: scansion (str) – the scansion pattern
Return corrected scansion: the corrected scansion pattern

>>> print(HexameterScanner().correct_invalid_fifth_foot(
... " -   - -   U U  -  U U U -  - U U U  - x")) 
-   - -   U U  -  U U U -  - - U U  - x

Return type: str

invalid_foot_to_spondee(feet, foot, idx)[source]¶

In hexameters, a single foot that is a unstressed_stressed syllable pattern is often just a double spondee, so here we coerce it to stressed.

Parameters

feet (list) – list of string representations of meterical feet
foot (str) – the bad foot to correct
idx (int) – the index of the foot to correct

Return type

str

Returns

corrected scansion

>>> print(HexameterScanner().invalid_foot_to_spondee(
... ['-UU', '--', '-U', 'U-', '--', '-UU'],'-U', 2))  
-UU----U----UU

correct_dactyl_chain(scansion)[source]¶

Three or more unstressed accents in a row is a broken dactyl chain, best detected and processed backwards.

Since this method takes a Procrustean approach to modifying the scansion pattern, it is not used by default in the scan method; however, it is available as an optional keyword parameter, and users looking to further automate the generation of scansion candidates should consider using this as a fall back.

Parameters: scansion (str) – scansion with broken dactyl chain; inverted amphibrachs not allowed
Return type: str
Returns: corrected line of scansion

>>> print(HexameterScanner().correct_dactyl_chain(
... "-   U U  -  - U U -  - - U U  - x"))
-   - -  -  - U U -  - - U U  - x
>>> print(HexameterScanner().correct_dactyl_chain(
... "-   U  U U  U -     -   -   -  -   U  U -   U")) 
-   -  - U  U -     -   -   -  -   U  U -   U

correct_inverted_amphibrachs(scansion)[source]¶

The ‘inverted amphibrach’: stressed_unstressed_stressed syllable pattern is invalid in hexameters, so here we coerce it to stressed: - U - -> - - -

Parameters: scansion (str) – the scansion stress pattern
Return type: str
Returns: a string with the corrected scansion pattern

>>> print(HexameterScanner().correct_inverted_amphibrachs(
... " -   U -   - U  -  U U U U  - U  - x")) 
-   - -   - -  -  U U U U  - -  - x
>>> print(HexameterScanner().correct_inverted_amphibrachs(
... " -   - -   U -  -  U U U U  U- - U  - x")) 
-   - -   - -  -  U U U U  U- - -  - x
>>> print(HexameterScanner().correct_inverted_amphibrachs(
... "-  - -   -  -   U -   U U -  U  U - -")) 
-  - -   -  -   - -   U U -  U  U - -
>>> print(HexameterScanner().correct_inverted_amphibrachs(
... "- UU-   U -   U -  -   U   U U   U-   U")) 
- UU-   - -   - -  -   U   U U   U-   U

8.1.12.1.1.5. cltk.prosody.lat.macronizer module¶

Delineate length of lat vowels.

The Macronizer class places a macron over naturally long Latin vowels. To discern whether a vowel is long, a word is first matched with its Morpheus entry by way of its POS tag. The Morpheus entry includes the macronized form of the matched word.

Since the accuracy of the macronizer largely derives from the accuracy of the POS tagger used to match words to their Morpheus entry, the Macronizer class allows for multiple POS to be used.

Todo

Determine how to disambiguate tags (see logger)

class cltk.prosody.lat.macronizer.Macronizer(tagger)[source]¶

Bases: object

Macronize Latin words.

Macronize text by using the POS tag to find the macronized form within the Morpheus database.

_retrieve_tag(text)[source]¶

Tag text with chosen tagger and clean tags.

Tag format: [('word', 'tag')]

Parameters: text – string
Returns: list of tuples, with each tuple containing the word and its pos tag

:rtype : list

_retrieve_morpheus_entry(word)[source]¶

Return Morpheus entry for word

Entry format: [(head word, tag, macronized form)]

Parameters: word – unmacronized, lowercased word
Ptype word: string
Returns: Morpheus entry in tuples

:rtype : list

_macronize_word(word)[source]¶

Return macronized word.

Parameters: word – (word, tag)
Ptype word: tuple
Returns: (word, tag, macronized_form)

:rtype : tuple

macronize_tags(text)[source]¶

Return macronized form along with POS tags.

E.g. “Gallia est omnis divisa in partes tres,” -> [(‘gallia’, ‘n-s—fb-‘, ‘galliā’), (‘est’, ‘v3spia—’, ‘est’), (‘omnis’, ‘a-s—mn-‘, ‘omnis’), (‘divisa’, ‘t-prppnn-‘, ‘dīvīsa’), (‘in’, ‘r——–’, ‘in’), (‘partes’, ‘n-p—fa-‘, ‘partēs’), (‘tres’, ‘m——–’, ‘trēs’)]

Parameters: text – raw text
Returns: tuples of head word, tag, macronized form

:rtype : list

macronize_text(text)[source]¶

Return macronized form of text.

E.g. “Gallia est omnis divisa in partes tres,” -> “galliā est omnis dīvīsa in partēs trēs ,”

Parameters: text – raw text
Returns: macronized text

:rtype : str

8.1.12.1.1.6. cltk.prosody.lat.metrical_validator module¶

Utility class for validating scansion patterns: hexameter, hendecasyllables, pentameter. Allows users to configure the scansion symbols internally via a constructor argument; a suitable default is provided.

class cltk.prosody.lat.metrical_validator.MetricalValidator(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>)[source]¶

Bases: object

Currently supports validation for: hexameter, hendecasyllables, pentameter.

is_valid_hexameter(scanned_line)[source]¶

Determine if a scansion pattern is one of the valid hexameter metrical patterns :type scanned_line: str :param scanned_line: a line containing a sequence of stressed and unstressed syllables :return bool

>>> print(MetricalValidator().is_valid_hexameter("-UU---UU---UU-U"))
True

Return type: bool

is_valid_hendecasyllables(scanned_line)[source]¶

Determine if a scansion pattern is one of the valid Hendecasyllables metrical patterns

Parameters: scanned_line (str) – a line containing a sequence of stressed and unstressed syllables

>>> print(MetricalValidator().is_valid_hendecasyllables("-U-UU-U-U-U"))
True

Return type: bool

is_valid_pentameter(scanned_line)[source]¶

Determine if a scansion pattern is one of the valid Pentameter metrical patterns

Parameters: scanned_line (str) – a line containing a sequence of stressed and unstressed syllables
Return bool: whether or not the scansion is a valid pentameter

>>> print(MetricalValidator().is_valid_pentameter('-UU-UU--UU-UUX'))
True

Return type: bool

hexameter_feet(scansion)[source]¶

Produces a list of hexameter feet, stressed and unstressed syllables with spaces intact. If the scansion line is not entirely correct, it will attempt to corral one or more improper patterns into one or more feet.

Param: scansion: the scanned line

:return list of strings, representing the feet of the hexameter, or if the scansion is wildly incorrect, the function will return an empty list.

>>> print("|".join(MetricalValidator().hexameter_feet(
... "- U U   -     -  - -   -  -     - U  U  -  U")).strip() )
- U U   |-     -  |- -   |-  -     |- U  U  |-  U
>>> print("|".join(MetricalValidator().hexameter_feet(
... "- U U   -     -  U -   -  -     - U  U  -  U")).strip())
- U U   |-     -  |U -   |-  -     |- U  U  |-  U

Return type: List[str]

static hexameter_known_stresses()[source]¶

Provide a list of known stress positions for a hexameter.

Return type: List[int]
Returns: a zero based list enumerating which syllables are known to be stressed.

static hexameter_possible_unstresses()[source]¶: Provide a list of possible positions which may be unstressed syllables in a hexameter. :rtype: List[int] :return: a zero based list enumerating which syllables are known to be unstressed.

closest_hexameter_patterns(scansion)[source]¶

Find the closest group of matching valid hexameter patterns.

Return type: List[str]
Returns: list of the closest valid hexameter patterns; only candidates with a matching

length/number of syllables are considered.

>>> print(MetricalValidator().closest_hexameter_patterns('-UUUUU-----UU--'))
['-UU-UU-----UU--']

static pentameter_possible_stresses()[source]¶

Provide a list of possible stress positions for a hexameter.

Return type: List[int]
Returns: a zero based list enumerating which syllables are known to be stressed.

closest_pentameter_patterns(scansion)[source]¶

Find the closest group of matching valid pentameter patterns.

Return type: List[str]
Returns: list of the closest valid pentameter patterns; only candidates with a matching

length/number of syllables are considered.

>>> print(MetricalValidator().closest_pentameter_patterns('--UUU--UU-UUX'))
['---UU--UU-UUX']

closest_hendecasyllable_patterns(scansion)[source]¶

Find the closest group of matching valid hendecasyllable patterns.

Return type: List[str]
Returns: list of the closest valid hendecasyllable patterns; only candidates with a matching

length/number of syllables are considered.

>>> print(MetricalValidator().closest_hendecasyllable_patterns('UU-UU-U-U-X'))
['-U-UU-U-U-X', 'U--UU-U-U-X']

_closest_patterns(patterns, scansion)[source]¶

Find the closest group of matching valid patterns.

Patterns: a list of patterns
Scansion: the scansion pattern thus far
Return type: List[str]
Returns: list of the closest valid patterns; only candidates with a matching

length/number of syllables are considered.

_build_hexameter_template(stress_positions)[source]¶

Build a hexameter scansion template from string of 5 binary numbers; NOTE: Traditionally the fifth foot is dactyl and spondee substitution is rare, however since it is a possible combination, we include it here.

Parameters: stress_positions (str) – 5 binary integers, indicating whether foot is dactyl or spondee
Return type: str
Returns: a valid hexameter scansion template, a string representing stressed and

unstresssed syllables with the optional terminal ending.

>>> print(MetricalValidator()._build_hexameter_template("01010"))
-UU---UU---UU-X

_build_pentameter_templates()[source]¶

Create pentameter templates.

Return type: List[str]

8.1.12.1.1.7. cltk.prosody.lat.pentameter_scanner module¶

Utility class for producing a scansion pattern for a Latin pentameter.

Given a line of pentameter, the scan method performs a series of transformation and checks are performed, and for each one performed successfully, a note is added to the scansion_notes list so that end users may view the provenance of a scansion.

class cltk.prosody.lat.pentameter_scanner.PentameterScanner(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>, syllabifier=<cltk.prosody.lat.syllabifier.Syllabifier object>, optional_transform=False, *args, **kwargs)[source]¶

Bases: cltk.prosody.lat.verse_scanner.VerseScanner

The scansion symbols used can be configured by passing a suitable constants class to the constructor.

scan(original_line, optional_transform=False)[source]¶

Scan a line of Latin pentameter and produce a scansion pattern, and other data.

Parameters

original_line (str) – the original line of Latin verse
optional_transform (bool) – whether or not to perform i to j transform for syllabification

Return type

Verse

Returns

a Verse object

>>> scanner = PentameterScanner()
>>> print(scanner.scan('ex hoc ingrato gaudia amore tibi.'))
Verse(original='ex hoc ingrato gaudia amore tibi.', scansion='-   -  -   - -   - U  U - U  U U ', meter='pentameter', valid=True, syllable_count=12, accented='ēx hōc īngrātō gaudia amōre tibi.', scansion_notes=['Spondaic pentameter'], syllables = ['ēx', 'hoc', 'īn', 'gra', 'to', 'gau', 'di', 'a', 'mo', 're', 'ti', 'bi'])
>>> print(scanner.scan(
... "in vento et rapida scribere oportet aqua.").scansion) 
-   -    -   U U -    - U   U -  U  U  U

make_spondaic(scansion)[source]¶

If a pentameter line has 12 syllables, then it must start with double spondees.

Parameters: scansion (str) – a string of scansion patterns
Return type: str
Returns: a scansion pattern string starting with two spondees

>>> print(PentameterScanner().make_spondaic("U  U  U  U  U  U  U  U  U  U  U  U"))
-  -  -  -  -  -  U  U  -  U  U  U

make_dactyls(scansion)[source]¶

If a pentameter line has 14 syllables, it starts and ends with double dactyls.

Parameters: scansion (str) – a string of scansion patterns
Return type: str
Returns: a scansion pattern string starting and ending with double dactyls

>>> print(PentameterScanner().make_dactyls("U  U  U  U  U  U  U  U  U  U  U  U  U  U"))
-  U  U  -  U  U  -  -  U  U  -  U  U  U

correct_penultimate_dactyl_chain(scansion)[source]¶

For pentameter the last two feet of the verse are predictable dactyls, and do not regularly allow substitutions.

Parameters: scansion (str) – scansion line thus far
Return type: str
Returns: corrected line of scansion

>>> print(PentameterScanner().correct_penultimate_dactyl_chain(
... "U  U  U  U  U  U  U  U  U  U  U  U  U  U"))
U  U  U  U  U  U  U  -  U  U  -  U  U  U

8.1.12.1.1.8. cltk.prosody.lat.scanner module¶

Scansion module for scanning Latin prose rhythms.

class cltk.prosody.lat.scanner.Scansion(punctuation=None, clausula_length=13, elide=True)[source]¶

Bases: object

Prepossesses Latin text for prose rhythm analysis.

SHORT_VOWELS = ['a', 'e', 'i', 'o', 'u', 'y']¶

LONG_VOWELS = ['ā', 'ē', 'ī', 'ō', 'ū']¶

VOWELS = ['a', 'e', 'i', 'o', 'u', 'y', 'ā', 'ē', 'ī', 'ō', 'ū']¶

DIPHTHONGS = ['ae', 'au', 'ei', 'oe', 'ui']¶

SINGLE_CONSONANTS = ['b', 'c', 'd', 'g', 'k', 'l', 'm', 'n', 'p', 'q', 'r', 's', 't', 'v', 'f', 'j']¶

DOUBLE_CONSONANTS = ['x', 'z']¶

CONSONANTS = ['b', 'c', 'd', 'g', 'k', 'l', 'm', 'n', 'p', 'q', 'r', 's', 't', 'v', 'f', 'j', 'x', 'z']¶

DIGRAPHS = ['ch', 'ph', 'th', 'qu']¶

LIQUIDS = ['r', 'l']¶

MUTES = ['b', 'p', 'd', 't', 'c', 'g']¶

MUTE_LIQUID_EXCEPTIONS = ['gl', 'bl']¶

NASALS = ['m', 'n']¶

SESTS = ['sc', 'sm', 'sp', 'st', 'z']¶

_tokenize_syllables(word)[source]¶

Tokenize syllables for word. “mihi” -> [{“syllable”: “mi”, index: 0, … } … ] Syllable properties:

syllable: string -> syllable index: int -> postion in word long_by_nature: bool -> is syllable long by nature accented: bool -> does receive accent long_by_position: bool -> is syllable long by position

Parameters: word (str) – string
Return type: List[Dict]
Returns: list

>>> Scansion()._tokenize_syllables("mihi")
[{'syllable': 'mi', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'hi', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}]
>>> Scansion()._tokenize_syllables("ivi")
[{'syllable': 'i', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'vi', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}]
>>> Scansion()._tokenize_syllables("audītū")
[{'syllable': 'au', 'index': 0, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': False}, {'syllable': 'dī', 'index': 1, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'tū', 'index': 2, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': False}]
>>> Scansion()._tokenize_syllables("ā")
[{'syllable': 'ā', 'index': 0, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': True}]
>>> Scansion()._tokenize_syllables("conjiciō")
[{'syllable': 'con', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': False}, {'syllable': 'ji', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'ci', 'index': 2, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}, {'syllable': 'ō', 'index': 3, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': False}]
>>> Scansion()._tokenize_syllables("lingua")
[{'syllable': 'lin', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}, {'syllable': 'gua', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}]
>>> Scansion()._tokenize_syllables("abrante")
[{'syllable': 'ab', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, 'mute+liquid'), 'accented': False}, {'syllable': 'ran', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}, {'syllable': 'te', 'index': 2, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}]
>>> Scansion()._tokenize_syllables("redemptor")
[{'syllable': 'red', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}, {'syllable': 'em', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}, {'syllable': 'ptor', 'index': 2, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}]
>>> Scansion()._tokenize_syllables("nagrante")
[{'syllable': 'na', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, 'mute+liquid'), 'accented': False}, {'syllable': 'gran', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}, {'syllable': 'te', 'index': 2, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}]

_tokenize_words(sentence)[source]¶

Tokenize words for sentence. “Puella bona est” -> [{word: puella, index: 0, … }, … ] Word properties:

word: string -> word index: int -> position in sentence syllables: list -> list of syllable objects syllables_count: int -> number of syllables in word

Parameters: sentence (str) – string
Return type: List[Dict]
Returns: list

>>> Scansion()._tokenize_words('dedērunt te miror antōnī quorum.')
[{'word': 'dedērunt', 'index': 0, 'syllables': [{'syllable': 'de', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}, {'syllable': 'dē', 'index': 1, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'runt', 'index': 2, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': False}], 'syllables_count': 3}, {'word': 'te', 'index': 1, 'syllables': [{'syllable': 'te', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}], 'syllables_count': 1}, {'word': 'miror', 'index': 2, 'syllables': [{'syllable': 'mi', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'ror', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 2}, {'word': 'antōnī', 'index': 3, 'syllables': [{'syllable': 'an', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': False}, {'syllable': 'tō', 'index': 1, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'nī', 'index': 2, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 3}, {'word': 'quorum.', 'index': 4, 'syllables': [{'syllable': 'quo', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'rum', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 2}]
>>> Scansion()._tokenize_words('a spes co i no xe cta.')
[{'word': 'a', 'index': 0, 'syllables': [{'syllable': 'a', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, 'sest'), 'accented': True}], 'syllables_count': 1}, {'word': 'spes', 'index': 1, 'syllables': [{'syllable': 'spes', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}], 'syllables_count': 1}, {'word': 'co', 'index': 2, 'syllables': [{'syllable': 'co', 'index': 0, 'elide': (True, 'weak'), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}], 'syllables_count': 1}, {'word': 'i', 'index': 3, 'syllables': [{'syllable': 'i', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}], 'syllables_count': 1}, {'word': 'no', 'index': 4, 'syllables': [{'syllable': 'no', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}], 'syllables_count': 1}, {'word': 'xe', 'index': 5, 'syllables': [{'syllable': 'xe', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}], 'syllables_count': 1}, {'word': 'cta.', 'index': 6, 'syllables': [{'syllable': 'cta', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}], 'syllables_count': 1}]
>>> Scansion()._tokenize_words('x')
[]
>>> Scansion()._tokenize_words('atae amo.')
[{'word': 'atae', 'index': 0, 'syllables': [{'syllable': 'a', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'tae', 'index': 1, 'elide': (True, 'strong'), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 2}, {'word': 'amo.', 'index': 1, 'syllables': [{'syllable': 'a', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'mo', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 2}]
>>> Scansion()._tokenize_words('bar rid.')
[{'word': 'bar', 'index': 0, 'syllables': [{'syllable': 'bar', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}], 'syllables_count': 1}, {'word': 'rid.', 'index': 1, 'syllables': [{'syllable': 'rid', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}], 'syllables_count': 1}]
>>> Scansion()._tokenize_words('ba brid.')
[{'word': 'ba', 'index': 0, 'syllables': [{'syllable': 'ba', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, 'mute+liquid'), 'accented': True}], 'syllables_count': 1}, {'word': 'brid.', 'index': 1, 'syllables': [{'syllable': 'brid', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}], 'syllables_count': 1}]

tokenize(text)[source]¶

Tokenize text on supplied characters. “Puella bona est. Puer malus est.” -> [ [{word: puella, syllables: […], index: 0}, … ], … ] :rtype: List[Dict] :return:list

>>> Scansion().tokenize('puella bona est. puer malus est.')
[{'plain_text_sentence': 'puella bona est', 'structured_sentence': [{'word': 'puella', 'index': 0, 'syllables': [{'syllable': 'pu', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}, {'syllable': 'el', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}, {'syllable': 'la', 'index': 2, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 3}, {'word': 'bona', 'index': 1, 'syllables': [{'syllable': 'bo', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'na', 'index': 1, 'elide': (True, 'weak'), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 2}, {'word': 'est', 'index': 2, 'syllables': [{'syllable': 'est', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}], 'syllables_count': 1}]}, {'plain_text_sentence': ' puer malus est', 'structured_sentence': [{'word': 'puer', 'index': 0, 'syllables': [{'syllable': 'pu', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'er', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': False}], 'syllables_count': 2}, {'word': 'malus', 'index': 1, 'syllables': [{'syllable': 'ma', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'lus', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 2}, {'word': 'est', 'index': 2, 'syllables': [{'syllable': 'est', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}], 'syllables_count': 1}]}, {'plain_text_sentence': '', 'structured_sentence': []}]

scan_text(text)[source]¶

Return a flat list of rhythms. Desired clausula length is passed as a parameter. Clausula shorter than the specified length can be exluded. :rtype: List[str] :return:

>>> Scansion().scan_text('dedērunt te miror antōnī quorum. sī quid est in mē ingenī jūdicēs quod sentiō.')
['u--uuu---ux', 'u---u--u---ux']

8.1.12.1.1.9. cltk.prosody.lat.scansion_constants module¶

Configuration class for specifying scansion constants.

class cltk.prosody.lat.scansion_constants.ScansionConstants(unstressed='U', stressed='-', optional_terminal_ending='X', separator='|')[source]¶

Bases: object

Constants containing strings have characters in upper and lower case since they will often be used in regular expressions, and used to preserve/a verse’s original case.

This class also allows users to customizing scansion constants and scanner behavior.

>>> constants = ScansionConstants(unstressed="U",stressed= "-", optional_terminal_ending="X")
>>> print(constants.DACTYL)
-UU

>>> smaller_constants = ScansionConstants(
... unstressed="˘",stressed= "¯", optional_terminal_ending="x")
>>> print(smaller_constants.DACTYL)
¯˘˘

HEXAMETER_ENDING¶: The following two constants are not offical scansion terms, but invalid in hexameters

DOUBLED_CONSONANTS¶: Prefix order not arbitrary; one will want to match on extra before ex

8.1.12.1.1.10. cltk.prosody.lat.scansion_formatter module¶

Utility class for formatting scansion patterns

class cltk.prosody.lat.scansion_formatter.ScansionFormatter(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>)[source]¶

Bases: object

Users can specify which scansion symbols to use in the formatting.

>>> print(ScansionFormatter().hexameter( "-UU-UU-UU---UU--"))
-UU|-UU|-UU|--|-UU|--
>>> constants = ScansionConstants(unstressed="˘",             stressed= "¯", optional_terminal_ending="x")
>>> formatter = ScansionFormatter(constants)
>>> print(formatter.hexameter( "¯˘˘¯˘˘¯˘˘¯¯¯˘˘¯¯"))
¯˘˘|¯˘˘|¯˘˘|¯¯|¯˘˘|¯¯

hexameter(line)[source]¶

Format a string of hexameter metrical stress patterns into foot divisions

Parameters: line (str) – the scansion pattern
Return type: str
Returns: the scansion string formatted with foot breaks

>>> print(ScansionFormatter().hexameter( "-UU-UU-UU---UU--"))
-UU|-UU|-UU|--|-UU|--

merge_line_scansion(line, scansion)[source]¶

Merge a line of verse with its scansion string. Do not accent dipthongs.

Parameters

line (str) – the original Latin verse line
scansion (str) – the scansion pattern

Return type

str

Returns

the original line with the scansion pattern applied via macrons

>>> print(ScansionFormatter().merge_line_scansion(
... "Arma virumque cano, Troiae qui prīmus ab ōrīs",
... "-  U  U -  U  U  -     UU-   -   - U  U  - -"))
Ārma virūmque canō, Troiae quī prīmus ab ōrīs

>>> print(ScansionFormatter().merge_line_scansion(
... "lītora, multum ille et terrīs iactātus et alto",
... " - U U   -     -    -   -  -   -  - U  U  -  U"))
lītora, mūltum īlle ēt tērrīs iāctātus et ālto

>>> print(ScansionFormatter().merge_line_scansion(
... 'aut facere, haec a te dictaque factaque sunt',
... ' -   U U      -  -  -  -  U  U  -  U  U  -  '))
aut facere, haec ā tē dīctaque fāctaque sūnt

8.1.12.1.1.11. cltk.prosody.lat.string_utils module¶

Utillity class for processing scansion and text.

cltk.prosody.lat.string_utils.remove_punctuation_dict()[source]¶

Provide a dictionary for removing punctuation, swallowing spaces.

:return dict with punctuation from the unicode table

>>> print("I'm ok! Oh #%&*()[]{}!? Fine!".translate(
... remove_punctuation_dict()).lstrip())
Im ok Oh  Fine

Return type: Dict[int, None]

cltk.prosody.lat.string_utils.punctuation_for_spaces_dict()[source]¶

Provide a dictionary for removing punctuation, keeping spaces. Essential for scansion to keep stress patterns in alignment with original vowel positions in the verse.

:return dict with punctuation from the unicode table

>>> print("I'm ok! Oh #%&*()[]{}!? Fine!".translate(
... punctuation_for_spaces_dict()).strip())
I m ok  Oh              Fine

Return type: Dict[int, str]

cltk.prosody.lat.string_utils.differences(scansion, candidate)[source]¶

Given two strings, return a list of index positions where the contents differ.

Parameters

scansion (str) –
candidate (str) –

Return type

List[int]

Returns

>>> differences("abc", "abz")
[2]

cltk.prosody.lat.string_utils.mark_list(line)[source]¶

Given a string, return a list of index positions where a character/non blank space exists.

Parameters: line (str) –
Return type: List[int]
Returns

>>> mark_list(" a b c")
[1, 3, 5]

cltk.prosody.lat.string_utils.space_list(line)[source]¶

Given a string, return a list of index positions where a blank space occurs.

Parameters: line (str) –
Return type: List[int]
Returns

>>> space_list("    abc ")
[0, 1, 2, 3, 7]

cltk.prosody.lat.string_utils.flatten(list_of_lists)[source]¶

Given a list of lists, flatten all the items into one list.

Parameters: list_of_lists –
Returns

>>> flatten([ [1, 2, 3], [4, 5, 6]])
[1, 2, 3, 4, 5, 6]

cltk.prosody.lat.string_utils.to_syllables_with_trailing_spaces(line, syllables)[source]¶

Given a line of syllables and spaces, and a list of syllables, produce a list of the syllables with trailing spaces attached as approriate.

Parameters

line (str) –
syllables (List[str]) –

Return type

List[str]

Returns

>>> to_syllables_with_trailing_spaces(' arma virumque cano ',
... ['ar', 'ma', 'vi', 'rum', 'que', 'ca', 'no' ])
[' ar', 'ma ', 'vi', 'rum', 'que ', 'ca', 'no ']

cltk.prosody.lat.string_utils.join_syllables_spaces(syllables, spaces)[source]¶

Given a list of syllables, and a list of integers indicating the position of spaces, return a string that has a space inserted at the designated points.

Parameters

syllables (List[str]) –
spaces (List[int]) –

Return type

str

Returns

>>> join_syllables_spaces(["won", "to", "tree", "dun"], [3, 6, 11])
'won to tree dun'

cltk.prosody.lat.string_utils.starts_with_qu(word)[source]¶

Determine whether or not a word start with the letters Q and U.

Parameters: word –
Return type: bool
Returns

>>> starts_with_qu("qui")
True
>>> starts_with_qu("Quirites")
True

cltk.prosody.lat.string_utils.stress_positions(stress, scansion)[source]¶

Given a stress value and a scansion line, return the index positions of the stresses.

Parameters

stress (str) –
scansion (str) –

Return type

List[int]

Returns

>>> stress_positions("-", "    -  U   U - UU    - U U")
[0, 3, 6]

cltk.prosody.lat.string_utils.merge_elisions(elided)[source]¶

Given a list of strings with different space swapping elisions applied, merge the elisions, taking the most without compounding the omissions.

Parameters: elided (List[str]) –
Return type: str
Returns

>>> merge_elisions([
... "ignavae agua multum hiatus", "ignav   agua multum hiatus" ,"ignavae agua mult   hiatus"])
'ignav   agua mult   hiatus'

cltk.prosody.lat.string_utils.move_consonant_right(letters, positions)[source]¶

Given a list of letters, and a list of consonant positions, move the consonant positions to the right, merging strings as necessary.

Parameters

letters (List[str]) –
positions (List[int]) –

Return type

List[str]

Returns

>>> move_consonant_right(list("abbra"), [ 2, 3])
['a', 'b', '', '', 'bra']

cltk.prosody.lat.string_utils.move_consonant_left(letters, positions)[source]¶

Given a list of letters, and a list of consonant positions, move the consonant positions to the left, merging strings as necessary.

Parameters

letters (List[str]) –
positions (List[int]) –

Return type

List[str]

Returns

>>> move_consonant_left(['a', 'b', '', '', 'bra'], [1])
['ab', '', '', '', 'bra']

cltk.prosody.lat.string_utils.merge_next(letters, positions)[source]¶

Given a list of letter positions, merge each letter with its next neighbor.

Parameters

letters (List[str]) –
positions (List[int]) –

Return type

List[str]

Returns

>>> merge_next(['a', 'b', 'o', 'v', 'o' ], [0, 2])
['ab', '', 'ov', '', 'o']
>>> # Note: because it operates on the original list passed in, the effect is not cummulative:
>>> merge_next(['a', 'b', 'o', 'v', 'o' ], [0, 2, 3])
['ab', '', 'ov', 'o', '']

cltk.prosody.lat.string_utils.remove_blanks(letters)[source]¶

Given a list of letters, remove any empty strings.

Parameters: letters (List[str]) –
Returns

>>> remove_blanks(['a', '', 'b', '', 'c'])
['a', 'b', 'c']

cltk.prosody.lat.string_utils.split_on(word, section)[source]¶

Given a string, split on a section, and return the two sections as a tuple.

Parameters

word (str) –
section (str) –

Return type

Tuple[str, str]

Returns

>>> split_on('hamrye', 'ham')
('ham', 'rye')

cltk.prosody.lat.string_utils.remove_blank_spaces(syllables)[source]¶

Given a list of letters, remove any blank spaces or empty strings.

Parameters: syllables (List[str]) –
Return type: List[str]
Returns

>>> remove_blank_spaces(['', 'a', ' ', 'b', ' ', 'c', ''])
['a', 'b', 'c']

cltk.prosody.lat.string_utils.overwrite(char_list, regexp, quality, offset=0)[source]¶

Given a list of characters and spaces, a matching regular expression, and a quality or character, replace the matching character with a space, overwriting with an offset and a multiplier if provided.

Parameters

char_list (List[str]) –
regexp (str) –
quality (str) –
offset (int) –

Return type

List[str]

Returns

>>> overwrite(list("multe igne"), r"e\s[aeiou]", " ")
['m', 'u', 'l', 't', ' ', ' ', 'i', 'g', 'n', 'e']

cltk.prosody.lat.string_utils.overwrite_dipthong(char_list, regexp, quality)[source]¶

Given a list of characters and spaces, a matching regular expression, and a quality or character, replace the matching character with a space, overwriting with an offset and a multiplier if provided.

Parameters

char_list (List[str]) – a list of characters
regexp (str) – a matching regular expression
quality (str) – a quality or character to replace

Return type

List[str]

Returns

a list of characters with the dipthong overwritten

>>> overwrite_dipthong(list("multae aguae"), r"ae\s[aeou]", " ")
['m', 'u', 'l', 't', ' ', ' ', ' ', 'a', 'g', 'u', 'a', 'e']

cltk.prosody.lat.string_utils.get_unstresses(stresses, count)[source]¶

Given a list of stressed positions, and count of possible positions, return a list of the unstressed positions.

Parameters

stresses (List[int]) – a list of stressed positions
count (int) – the number of possible positions

Return type

List[int]

Returns

a list of unstressed positions

>>> get_unstresses([0, 3, 6, 9, 12, 15], 17)
[1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16]

8.1.12.1.1.12. cltk.prosody.lat.syllabifier module¶

Latin language syllabifier. Parses a lat word or a space separated list of words into a list of syllables. Consonantal I is transformed into a J at the start of a word as necessary. Tuned for poetry and verse, this class is tolerant of isolated single character consonants that may appear due to elision.

class cltk.prosody.lat.syllabifier.Syllabifier(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>)[source]¶

Bases: object

Scansion constants can be modified and passed into the constructor if desired.

syllabify(words)[source]¶

Parse a Latin word into a list of syllable strings.

Parameters: words (str) – a string containing one lat word or many words separated by spaces.
Return type: List[str]
Returns: list of string, each representing a syllable.

>>> syllabifier = Syllabifier()
>>> print(syllabifier.syllabify("fuit"))
['fu', 'it']
>>> print(syllabifier.syllabify("libri"))
['li', 'bri']
>>> print(syllabifier.syllabify("contra"))
['con', 'tra']
>>> print(syllabifier.syllabify("iaculum"))
['ja', 'cu', 'lum']
>>> print(syllabifier.syllabify("amo"))
['a', 'mo']
>>> print(syllabifier.syllabify("bracchia"))
['brac', 'chi', 'a']
>>> print(syllabifier.syllabify("deinde"))
['dein', 'de']
>>> print(syllabifier.syllabify("certabant"))
['cer', 'ta', 'bant']
>>> print(syllabifier.syllabify("aere"))
['ae', 're']
>>> print(syllabifier.syllabify("adiungere"))
['ad', 'jun', 'ge', 're']
>>> print(syllabifier.syllabify("mōns"))
['mōns']
>>> print(syllabifier.syllabify("domus"))
['do', 'mus']
>>> print(syllabifier.syllabify("lixa"))
['li', 'xa']
>>> print(syllabifier.syllabify("asper"))
['as', 'per']
>>> #  handle doubles
>>> print(syllabifier.syllabify("siccus"))
['sic', 'cus']
>>> # handle liquid + liquid
>>> print(syllabifier.syllabify("almus"))
['al', 'mus']
>>> # handle liquid + mute
>>> print(syllabifier.syllabify("ambo"))
['am', 'bo']
>>> print(syllabifier.syllabify("anguis"))
['an', 'guis']
>>> print(syllabifier.syllabify("arbor"))
['ar', 'bor']
>>> print(syllabifier.syllabify("pulcher"))
['pul', 'cher']
>>> print(syllabifier.syllabify("ruptus"))
['ru', 'ptus']
>>> print(syllabifier.syllabify("Bīthÿnus"))
['Bī', 'thÿ', 'nus']
>>> print(syllabifier.syllabify("sanguen"))
['san', 'guen']
>>> print(syllabifier.syllabify("unguentum"))
['un', 'guen', 'tum']
>>> print(syllabifier.syllabify("lingua"))
['lin', 'gua']
>>> print(syllabifier.syllabify("linguā"))
['lin', 'guā']
>>> print(syllabifier.syllabify("languidus"))
['lan', 'gui', 'dus']
>>> print(syllabifier.syllabify("suis"))
['su', 'is']
>>> print(syllabifier.syllabify("habui"))
['ha', 'bu', 'i']
>>> print(syllabifier.syllabify("habuit"))
['ha', 'bu', 'it']
>>> print(syllabifier.syllabify("qui"))
['qui']
>>> print(syllabifier.syllabify("quibus"))
['qui', 'bus']
>>> print(syllabifier.syllabify("hui"))
['hui']
>>> print(syllabifier.syllabify("cui"))
['cui']
>>> print(syllabifier.syllabify("huic"))
['huic']

_setup(word)[source]¶

Prepares a word for syllable processing.

If the word starts with a prefix, process it separately. :param word: :rtype: List[str] :return:

convert_consonantal_i(word)[source]¶

Convert i to j when at the start of a word.

Return type: str

_process(word)[source]¶

Process a word into a list of strings representing the syllables of the word. This method describes rules for consonant grouping behaviors and then iteratively applies those rules the list of letters that comprise the word, until all the letters are grouped into appropriate syllable groups.

Parameters: word (str) –
Return type: List[str]
Returns

_contains_consonants(letter_group)[source]¶

Check if a string contains consonants.

Return type: bool

_contains_vowels(letter_group)[source]¶

Check if a string contains vowels.

Return type: bool

_ends_with_vowel(letter_group)[source]¶

Check if a string ends with a vowel.

Return type: bool

_starts_with_vowel(letter_group)[source]¶

Check if a string starts with a vowel.

Return type: bool

_starting_consonants_only(letters)[source]¶

Return a list of starting consonant positions.

Return type: list

_ending_consonants_only(letters)[source]¶

Return a list of positions for ending consonants.

Return type: List[int]

_find_solo_consonant(letters)[source]¶

Find the positions of any solo consonants that are not yet paired with a vowel.

Return type: List[int]

_find_consonant_cluster(letters)[source]¶: Find clusters of consonants that do not contain a vowel. :type letters: List[str] :param letters: :rtype: List[int] :return:

_move_consonant(letters, positions)[source]¶

Given a list of consonant positions, move the consonants according to certain consonant syllable behavioral rules for gathering and grouping.

Parameters

letters (list) –
positions (List[int]) –

Return type

List[str]

Returns

get_syllable_count(syllables)[source]¶

Counts the number of syllable groups that would occur after ellision.

Often we will want preserve the position and separation of syllables so that they can be used to reconstitute a line, and apply stresses to the original word positions. However, we also want to be able to count the number of syllables accurately.

Parameters: syllables (List[str]) –
Return type: int
Returns

>>> syllabifier = Syllabifier()
>>> print(syllabifier.get_syllable_count([
... 'Jām', 'tūm', 'c', 'au', 'sus', 'es', 'u', 'nus', 'I', 'ta', 'lo', 'rum']))
11

8.1.12.1.1.13. cltk.prosody.lat.verse module¶

Data structure class for a line of metrical verse.

class cltk.prosody.lat.verse.Verse(original, scansion='', meter=None, valid=False, syllable_count=0, accented='', scansion_notes=None, syllables=None)[source]¶

Bases: object

Class representing a line of metrical verse.

This class is round-trippable; the __repr__ call can be used for construction.

>>> positional_hex = Verse(original='impulerit. Tantaene animis caelestibus irae?',
... scansion='-  U U -    -   -   U U -    - -  U U  -  - ', meter='hexameter',
... valid=True, syllable_count=15, accented='īmpulerīt. Tāntaene animīs caelēstibus īrae?',
... scansion_notes=['Valid by positional stresses.'],
... syllables = ['īm', 'pu', 'le', 'rīt', 'Tān', 'taen', 'a', 'ni', 'mīs', 'cae', 'lēs', 'ti', 'bus', 'i', 'rae'])
>>> dupe = eval(positional_hex.__repr__())
>>> dupe
Verse(original='impulerit. Tantaene animis caelestibus irae?', scansion='-  U U -    -   -   U U -    - -  U U  -  - ', meter='hexameter', valid=True, syllable_count=15, accented='īmpulerīt. Tāntaene animīs caelēstibus īrae?', scansion_notes=['Valid by positional stresses.'], syllables = ['īm', 'pu', 'le', 'rīt', 'Tān', 'taen', 'a', 'ni', 'mīs', 'cae', 'lēs', 'ti', 'bus', 'i', 'rae'])
>>> positional_hex
Verse(original='impulerit. Tantaene animis caelestibus irae?', scansion='-  U U -    -   -   U U -    - -  U U  -  - ', meter='hexameter', valid=True, syllable_count=15, accented='īmpulerīt. Tāntaene animīs caelēstibus īrae?', scansion_notes=['Valid by positional stresses.'], syllables = ['īm', 'pu', 'le', 'rīt', 'Tān', 'taen', 'a', 'ni', 'mīs', 'cae', 'lēs', 'ti', 'bus', 'i', 'rae'])

working_line¶: placeholder for data transformations

8.1.12.1.1.14. cltk.prosody.lat.verse_scanner module¶

Parent class and utility class for producing a scansion pattern for a line of Latin verse.

Some useful methods * Perform a conservative i to j transformation * Performs elisions * Accents vowels by position * Breaks the line into a list of syllables by calling a Syllabifier class which may be injected

into this classes constructor.

class cltk.prosody.lat.verse_scanner.VerseScanner(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>, syllabifier=<cltk.prosody.lat.syllabifier.Syllabifier object>, **kwargs)[source]¶

Bases: object

The scansion symbols used can be configured by passing a suitable constants class to the constructor.

transform_i_to_j(line)[source]¶

Transform instances of consonantal i to j :type line: str :param line: :rtype: str :return:

>>> print(VerseScanner().transform_i_to_j("iactātus"))
jactātus
>>> print(VerseScanner().transform_i_to_j("bracchia"))
bracchia

transform_i_to_j_optional(line)[source]¶

Sometimes for the demands of meter a more permissive i to j transformation is warranted.

Parameters: line (str) –
Return type: str
Returns

>>> print(VerseScanner().transform_i_to_j_optional("Italiam"))
Italjam
>>> print(VerseScanner().transform_i_to_j_optional("Lāvīniaque"))
Lāvīnjaque
>>> print(VerseScanner().transform_i_to_j_optional("omnium"))
omnjum

accent_by_position(verse_line)[source]¶

Accent vowels according to the rules of scansion.

Parameters: verse_line (str) – a line of unaccented verse
Return type: str
Returns: the same line with vowels accented by position

>>> print(VerseScanner().accent_by_position(
... "Arma virumque cano, Troiae qui primus ab oris").lstrip())
Ārma virūmque canō  Trojae qui primus ab oris

elide_all(line)[source]¶

Given a string of space separated syllables, erase with spaces the syllable portions that would disappear according to the rules of elision.

Parameters: line (str) –
Return type: str
Returns

calc_offset(syllables_spaces)[source]¶

Calculate a dictionary of accent positions from a list of syllables with spaces.

Parameters: syllables_spaces (List[str]) –
Return type: Dict[int, int]
Returns

produce_scansion(stresses, syllables_wspaces, offset_map)[source]¶

Create a scansion string that has stressed and unstressed syllable positions in locations that correspond with the original texts syllable vowels.

:param stresses list of syllable positions :param syllables_wspaces list of syllables with spaces escaped for punctuation or elision :param offset_map dictionary of syllable positions, and an offset amount which is the number of spaces to skip in the original line before inserting the accent.

Return type: str

flag_dipthongs(syllables)[source]¶

Return a list of syllables that contain a dipthong

Parameters: syllables (List[str]) –
Return type: List[int]
Returns

elide(line, regexp, quantity=1, offset=0)[source]¶

Erase a section of a line, matching on a regex, pushing in a quantity of blank spaces, and jumping forward with an offset if necessary. If the elided vowel was strong, the vowel merged with takes on the stress.

Parameters

line (str) –
regexp (str) –
quantity (int) –
offset (int) –

Return type

str

Returns

>>> print(VerseScanner().elide("uvae avaritia", r"[e]\s*[a]"))
uv   āvaritia
>>> print(VerseScanner().elide("mare avaritia", r"[e]\s*[a]"))
mar  avaritia

correct_invalid_start(scansion)[source]¶

If a hexameter, hendecasyllables, or pentameter scansion starts with spondee, an unstressed syllable in the third position must actually be stressed, so we will convert it: - - | U -> - - | -

Parameters: scansion (str) –
Return type: str
Returns

>>> print(VerseScanner().correct_invalid_start(
... " -   - U   U -  -  U U U U  U U  - -").strip())
-   - -   - -  -  U U U U  U U  - -

correct_first_two_dactyls(scansion)[source]¶

If a hexameter or pentameter starts with spondee, an unstressed syllable in the third position must actually be stressed, so we will convert it: - - | U -> - - | - And/or if the starting pattern is spondee + trochee + stressed, then the unstressed trochee can be corrected: - - | - u | - -> - - | - -| -

Parameters: scansion (str) –
Return type: str
Returns

>>> print(VerseScanner().correct_first_two_dactyls(
... " -   - U   U -  -  U U U U  U U  - -")) 
 -   - -   - -  -  U U U U  U U  - -

assign_candidate(verse, candidate)[source]¶

Helper method; make sure that the verse object is properly packaged.

Parameters

verse (Verse) –
candidate (str) –

Return type

Verse

Returns

8.1.12.1.1. cltk.prosody.lat package¶

8.1.12.1.1.1. Submodules¶

8.1.12.1.1.2. cltk.prosody.lat.clausulae_analysis module¶

8.1.12.1.1.3. cltk.prosody.lat.hendecasyllable_scanner module¶

8.1.12.1.1.4. cltk.prosody.lat.hexameter_scanner module¶

8.1.12.1.1.5. cltk.prosody.lat.macronizer module¶

8.1.12.1.1.6. cltk.prosody.lat.metrical_validator module¶

8.1.12.1.1.7. cltk.prosody.lat.pentameter_scanner module¶

8.1.12.1.1.8. cltk.prosody.lat.scanner module¶

8.1.12.1.1.9. cltk.prosody.lat.scansion_constants module¶

8.1.12.1.1.10. cltk.prosody.lat.scansion_formatter module¶

8.1.12.1.1.11. cltk.prosody.lat.string_utils module¶

8.1.12.1.1.12. cltk.prosody.lat.syllabifier module¶

8.1.12.1.1.13. cltk.prosody.lat.verse module¶

8.1.12.1.1.14. cltk.prosody.lat.verse_scanner module¶

The Classical Language Toolkit

Navigation

Related Topics