Music Representation: Notation, Conversion, & Acquisition
Donald Byrd
18 Oct. 2006
Copyright © 2006, Donald Byrd

Review: Representations of Music
• Three basic forms (representations) of music
  – Audio: most important for most people (general public)
  – MIDI files: often best/essential for some musicians, especially for pop, rock, film/TV
  – Notation: often best/essential for musicians (even amateurs) & music scholars
  – Essential difference: how much explicit structure
• Music holdings of Library of Congress: over 10M items
  – Includes over 6M pieces of sheet music and 100K’s of scores of operas, symphonies, etc.: all notation!
• Differences are profound
rev. 8 Sep. 2006 2

Review: Basic Representations of Music & Audio
• Audio (e.g., CD, MP3): like speech
• Time-stamped Events (e.g., MIDI file): like unformatted text
• Music Notation: like text with complex formatting
1 Sep. 2006 3

Basic Representations of Music & Audio
                    | Audio | Time-stamped Events | Music Notation
Common examples     | CD, MP3 file | Standard MIDI File | Sheet music
Unit                | Sample | Event | Note, clef, lyric, etc.
Explicit structure  | none | little (partial voicing information) | much (complete voicing information)
Avg. rel. storage   | 2000 | 1 | 10
Convert to left     | - | OK job: easy; Good job: hard | OK job: easy; Good job: hard
Convert to right    | 1 note: fairly hard; Other: very hard | OK job: hard; Good job: very hard | -
Ideal for           | music, bird/animal sounds, sound effects, speech | music | music
rev. 4 Oct. 2006 4

Review: The Four Parameters of Notes
• Four basic parameters of a definite-pitched musical note
  1. pitch: how high or low the sound is: perceptual analog of frequency
  2. duration: how long the note lasts
  3. loudness: perceptual analog of amplitude
  4. timbre or tone quality
• Above is decreasing order of importance for most Western music
• …and decreasing order of explicitness in CMN!
30 Aug. 2006 5

How to Read Music Without Really Trying
• CMN shows at least six aspects of music:
  – NP1.
Pitches (how high or low): on vertical axis
  – NP2. Durations (how long): indicated by note/rest shapes
  – NP3. Loudness: indicated by signs like p, mf, etc.
  – NP4. Timbre (tone quality): indicated with words like “violin”, “pizzicato”, etc.
  – Start times: on horizontal axis
  – Voicing: mostly indicated by staff; in complex cases also shown by stem direction, beams, etc.
• See “Essentials of Music Reading” musical example.
30 Aug. 2006 6

4. Music vs. Text and Other Media
                 | ———— Explicit Structure ————
                 | least | medium | most | Salience increasers
Music            | audio | events | notation | loud; thin texture
Text             | audio (speech) | ordinary written text | text with markup | “headlining”: large, bold, etc.
Images           | photo, bitmap | PostScript | drawing-program file | bright color
Video            | videotape w/o sound | MPEG? | Premiere file | motion, etc.
Biological data  | DNA sequences, 3D protein structures | MEDLINE abstracts | ?? |

Classification: Surgeon General’s Warning
• Classification (ordinary hierarchic) is dangerous
  – Almost everything in the real world is messy
  – Absolute correlations between characteristics are rare
  – Example: some mammals lay eggs; some are “naked”
  – Example: music genres (crossover, chorus + sax, etc.)
• People say “an X has features A, B, C, D…”
• Nearly always means “has feature A, and usually also B, C, D…”
• Leads to:
  – People who know better claiming absolute correlations
  – Arguments among experts over which feature is most fundamental
rev. 4 Oct. 06 8

Representation vs. Encoding
• Representation: what information is conveyed?
  – More abstract (conceptual)
  – Basic = general type of info; specific = exact type
• Encoding: how is the information conveyed?
  – More concrete: in computer (“bits”)…or on paper (“atoms”)!
• One representation can have many encodings
  – “Atoms” example: music notation in printed or Braille form
  – “Bits” example: any kind of text in ASCII vs. Unicode
  – “Bits” example: formatted text in HTML, RTF, .doc
30 Jan. 06 9

Basic and Specific Representations vs.
Encodings
[Figure: basic and specific representations above a dotted line, encodings below it. Labels shown:]
• Basic representations: Audio; Time-stamped Events; Music Notation
• Specific representations: Waveform; Time-stamped MIDI; Csound score; Time-stamped expMIDI; Gamelan not.; Tablature; CMN; Mensural not.
• Encodings: .WAV; Red Book (CD); SMF; Csound score; Notelist; expMIDI File; MusicXML; Finale ETF
rev. 15 Feb. 10

Selfridge-Field on Describing Musical Information
• Cf. Selfridge-Field, E. (1997). Describing Musical Information.
• What is Music Representation? (informal use of term!)
  – Codes in Common Use: solfegge (pitch only), CMN, etc.
  – “Representations” for Computer Application: “total”, MIDI
• Parameters of Musical Information
  – Contexts: sound, notation/graphical, analytic, semantic; gestural?
  – Concentrates on 1st three
• Processing Order: horizontal or vertical priority
• Code Categories
  – Sound Related Codes: MIDI & other
  – Music Notation Codes: DARMS, SCORE, Notelist, Braille!?, etc.
  – Music Data for Analysis: Plaine & Easie, Kern, MuseData, etc.
  – Representations of Musical Patterns & Process
  – Interchange Codes: SMDL, NIFF, etc.; almost obsolete!
30 Jan. 06 11

Domains of Musical Information
• Independent graphic & performance info common
  – Cadenzas (classical), swing (jazz), rubato passages (all music)
• CMN “counterexamples” show importance of independent graphic and logical info
  – Debussy: bass clef below the staff
  – Chopin: noteheads are normal 16ths in one voice, triplets in another
• Mockingbird (early 1980’s) pioneered three domains:
  – Logical: “note is a qtr note” (= ESF (Selfridge-Field)’s “notation”)
  – Performance: “note sounds for 456/480ths of a quarter” (= ESF’s “sound”; also called gestural)
  – Graphic: “notehead is diamond shaped” (= ESF’s “notation”)
  – Nightingale and other programs followed
• SMDL added 4th domain
  – Analytic: for Roman numerals, Schenkerian level, etc. (= ESF’s “analytic”)
1 Feb.
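The three domains on the Domains of Musical Information slide can be pictured as separate fields of one note record. What follows is a minimal sketch, not the data structure of Mockingbird, Nightingale, or any other program: the class name `Note`, its field names, and the 480-ticks-per-quarter resolution are all assumptions made for illustration, the last one read off the slide's "456/480ths of a quarter" example.

```python
from dataclasses import dataclass
from fractions import Fraction

# Assumed resolution, taken from the slide's "456/480ths of a quarter" example.
TICKS_PER_QUARTER = 480


@dataclass
class Note:
    """Hypothetical note record with one field per domain."""

    # Logical domain: what the note "is" on paper, in quarter-note units.
    logical_quarters: Fraction
    # Performance domain: how long the note actually sounds, in ticks.
    play_ticks: int
    # Graphic domain: how the notehead is drawn.
    notehead_shape: str = "normal"

    def sounding_ratio(self) -> Fraction:
        """Played length as a fraction of the notated length."""
        notated_ticks = self.logical_quarters * TICKS_PER_QUARTER
        return Fraction(self.play_ticks) / notated_ticks


# A quarter note that sounds for 456 ticks and is drawn with a diamond notehead.
note = Note(logical_quarters=Fraction(1), play_ticks=456, notehead_shape="diamond")
print(note.sounding_ratio())  # 19/20
```

Keeping the fields independent is the point: changing `play_ticks` (a rubato performance, say) leaves the logical and graphic information untouched.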
06 12

Different Classifications of Music Encodings
Selfridge-Field | Byrd
Sound-related codes (1): MIDI | Time-stamped MIDI
Sound-related codes (2): Other Codes for Representation and Control | Time-stamped Events + Audio
Musical Notation Codes (1): DARMS | CMN (domains L, G)
Musical Notation Codes (2): Other ASCII Representations | CMN (domains L, G)
Musical Notation Codes (3): Graphical-object Descriptions | CMN (domains L, P, G)
Musical Notation Codes (4): Braille | CMN: non-computer representation!
Codes for Data Management and Analysis (1): Monophonic Representations | CMN (emphasizes domain A)
Codes for Data Management and Analysis (2): Polyphonic Representations | CMN (emphasizes domain A)
Representations of Musical Patterns and Processes | “CMN” (abstracted; emphasizes A)
Interchange Codes | CMN (domains L, P, G, A)
10 Feb. 13

Music Notation Software and Intelligence (1)
• Cf. Byrd, D. (1994). Music Notation Software and Intelligence.
• Cases where famous composers flagrantly violate important rules, yet results are easily readable
  Fig. 1. Changing time signature in middle of the measure (J.S. Bach)
  Fig. 2. A measure with four horizontal positions for notes that are all on the downbeat (Brahms)
  Very different ways to have two clefs in effect at the same time:
  Fig. 3. Bizarrely obvious (Debussy)
  Fig. 4. So subtle, must think about the 3/8 meter to see bass and treble clefs are both in effect throughout the measure (Ravel)
• Really nothing very strange going on in any of these
rev. 15 Feb. 14

Music Notation Software and Intelligence (2)
• Rules of CMN interact and aren’t always consistent
• Programmers try to help users by having programs do things “automatically”
• A good idea if software knows enough to do the right thing “almost all” the time
• Notation programs convert CMN to performance (MIDI) and vice-versa => makes things worse
• Severo Ornstein’s complaint: programs that assume a defined rhythmic structure
22 Feb.
06 15

Surprise: Music Notation has MetaPrinciples!
1. Maximize readability (intelligibility)
  – Avoid clutter = “Omit Needless Symbols”
  – Try to assume just the right things for audience
  – Audience for CMN is (primarily) performers
  – General principle of any communication
    • Applies to talks as well as music notation!
  – Examples: Schubert (avoid tuplet numerals), Bach (avoid tuplets)
2. Minimize space used
  – Save space => fewer page turns (helps performer); also cheaper to print (helps publisher)
  – Squeezing much music into little space is a major factor in complexity of CMN
  – Especially important for music: real-time, performer’s hands full
  – Examples: Telemann, Debussy, Ravel (for all, reduce staves)
22 Feb. 06 16

Dimensions of Music Representations (1)
[Figure: representations plotted on two axes, Expressive Completeness and Structural Generality. Points shown: Waveform, Csound, MusicXML, Notelist, MIDI (SMF).]
(After Wiggins et al. (1993). A Framework for the Evaluation of Music Representation Systems.)
rev. 3 Feb. 17

Dimensions of Music Representations (2)
• Expressive completeness
  – How much of all possible music can the representation express?
  – Includes synthesized as well as acoustic sounds!
  – Waveform (=audio) is truly “complete”
  – Exception, sort of: conceptual music
    • E.g., Celestial Music for Imaginary Trumpets (notes on 100 ledger lines), Cage: 4’ 33” (of silence), etc.
• Structural generality
  – How much of structure in any piece of music can it express?
  – Music notation with repeat signs, etc. still expresses nowhere near all possible structure
30 Jan. 06 18

Representation Example: a Bit of Mozart
The first few measures of Variation 8 of the “Twinkle” Variations
27 Jan.
19

In Notation Form: Nightingale Notelist
    %%Notelist-V2 file='MozartRepresentationEx' partstaves=2 0 startmeas=193
    C stf=1 type=3
    C stf=2 type=10
    K stf=1 KS=3 b
    K stf=2 KS=3 b
    T stf=1 num=2 denom=4
    T stf=2 num=2 denom=4
    A v=1 npt=1 stf=1 S1 'Variation 8'
    D stf=1 dType=5
    N t=0 v=1 npt=1 stf=1 dur=5 dots=0 nn=72 acc=0 eAcc=3 pDur=228 vel=55 ...... appear=1
    R t=0 v=2 npt=1 stf=2 dur=-1 dots=0 ...... appear=1
    N t=240 v=1 npt=1 stf=1 dur=5 dots=0 nn=74 acc=0 eAcc=3 pDur=228 vel=55 ...... appear=1
    N t=480 v=1 npt=1 stf=1 dur=5 dots=0 nn=75 acc=0 eAcc=2 pDur=228 vel=55 ...... appear=1
    N t=720 v=1 npt=1 stf=1 dur=5 dots=0 nn=77 acc=0 eAcc=3 pDur=228 vel=55 ...... appear=1
    / t=960 type=1
    N t=960 v=1 npt=1 stf=1 dur=4 dots=0 nn=79 acc=0 eAcc=3 pDur=456 vel=55 ...... appear=1
    (etc. File size: 1862 bytes)
27 Jan. 20

Music Notation: Attempts at Standard Encodings
• XML-based (concept of markup language)
  – SGML = Standard Generalized Markup Language
  – “Application” of SGML for music
    • SMDL = Standard Music Description Language: early & v. powerful, but a flop
  – XML = eXtensible Markup Language is hugely popular
  – Applications of XML for music
    • MusicXML is by far most popular; most verbose (5 notes of Mozart = 270 lines!)
    • MEI also significant; others include MusiXML, MNML, NIFFML, etc. etc.
• Castan’s site www.music-notation.info lists programs importing & exporting each encoding
  – Gives an idea of which are most important/popular
  – MusicXML is hands-down winner; next are GUIDO, NIFF, SCORE (early 2006)
27 Feb. 06 21

An Event Form: Standard MIDI File (file dump)
      0: 4D54 6864 0000 0006 0001 0003 01E0 4D54   MThd.........‡MT
     16: 726B 0000 0014 00FF 5103 0B70 C000 FF58   rk......Q..p¿..X
     32: 0402 0218 0896 34FF 2F00 4D54 726B 0000   .....ñ4./.MTrk..
     48: 0055 00FF 0305 5069 616E 6F00 9048 3881   .U....Piano.êH8Å
     64: 6480 4840 0C90 4A38 8164 804A 400C 904B   dÄH@.êJ8ÅdÄJ@.êK
     80: 3881 6480 4B40 0C90 4D38 8164 804D 400C   8ÅdÄK@.êM8ÅdÄM@.
     96: 904F 3883 4880 4F40 1890 4F38 8360 9050   êO8ÉHÄO@.êO8É`êP
    112: 3883 4880 4F40 1890 4D38 8330 8050 4018   8ÉHÄO@.êM8É0ÄP@.
    128: 804D 400D FF2F 004D 5472 6B00 0000 3200   ÄM@../.MTrk...2.
    144: FF03 0550 6961 6E6F 8F00 9041 2B81 6480   ...Pianoè.êA+ÅdÄ
    160: 4140 0C90 4330 8164 8043 400C 9044 3181   A@.êC0ÅdÄC@.êD1Å
    176: 6480 4440 0C90 4647 8164 8046 4001 FF2F   dÄD@.êFGÅdÄF@../
    192: 00                                        .
5 Feb. 22

An Event Form: Standard MIDI File (interpreted)
    Header format=1 ntrks=3 division=480

    Track #1 start
    t=0 Tempo microsec/MIDI-qtr=749760
    t=0 Time sig=2/4 MIDI-clocks/click=24 32nd-notes/24-MIDI-clocks=8
    t=2868 Meta event, end of track
    Track end

    Track #2 start
    t=0 Meta Text, type=0x03 (Sequence/Track Name) leng=5
        Text = <Piano>
    t=0 NOn ch=1 num=72 vel=56
    t=228 NOff ch=1 num=72 vel=64
    t=240 NOn ch=1 num=74 vel=56
    t=468 NOff ch=1 num=74 vel=64
    (etc. File size: 193 bytes)
27 Jan. 23

MIDI (Musical Instrument Digital Interface) (1)
• Invented in early 80’s
  – Dawn of personal computers
  – Designed as simple (& cheap to implement) real-time protocol for communication between synthesizers
  – Low bandwidth: 31.25 Kbps
• Top bit of byte: 1 = status, 0 = data
  – Numbers usually 7 bits (range 0-127); sometimes 14 or more
• Message types
  – Channel Voice
  – Channel Mode
  – System Common
  – System Real-Time
  – System Exclusive
5 Feb.
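The "top bit" rule on the MIDI (1) slide can be tried out on the first Note On / Note Off pair from the interpreted listing (NOn ch=1 num=72 vel=56, then NOff ch=1 num=72 vel=64). The sketch below is illustrative only: the function names are made up here, and the status values 0x9n for Note On and 0x8n for Note Off, with the channel in the low four bits, come from the MIDI 1.0 specification rather than from the slides. It handles only those two message types.

```python
def is_status_byte(byte: int) -> bool:
    """Top bit set means a status byte; top bit clear means a data byte."""
    if not 0 <= byte <= 0xFF:
        raise ValueError(f"not a byte: {byte}")
    return bool(byte & 0x80)


def decode_note_message(status: int, note: int, velocity: int) -> dict:
    """Decode one 3-byte Note On or Note Off message (hypothetical helper)."""
    if not is_status_byte(status):
        raise ValueError("first byte must be a status byte")
    if is_status_byte(note) or is_status_byte(velocity):
        raise ValueError("data bytes must have the top bit clear (0-127)")
    kinds = {0x8: "NOff", 0x9: "NOn"}  # high nibble of the status byte
    high_nibble = status >> 4
    if high_nibble not in kinds:
        raise ValueError("only Note On / Note Off are handled in this sketch")
    return {
        "kind": kinds[high_nibble],
        "ch": (status & 0x0F) + 1,  # low nibble 0-15, shown to users as 1-16
        "num": note,
        "vel": velocity,
    }


def combine_14_bit(msb: int, lsb: int) -> int:
    """Join two 7-bit data bytes into one 14-bit number (range 0-16383)."""
    return ((msb & 0x7F) << 7) | (lsb & 0x7F)


print(decode_note_message(0x90, 0x48, 0x38))
print(decode_note_message(0x80, 0x48, 0x40))
print(combine_14_bit(0x7F, 0x7F))
```

Because the status byte uses its top bit as a flag and four bits for the channel, only seven bits remain for each data value, which is where the 0-127 range on the slide comes from.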
06 24

MIDI (2)
• Important standard Events are mostly Channel Voice msgs
  – Note On: channel (1-16), note number (0-127), on velocity
  – Note Off: channel, note number, off velocity
• Can change “voice” any time with Program Change msg
• A way around the 16-channel limit: cables
  – may or may not correspond to a physical cable
  – each cable supports 16 channels independent of others
  – Systems with 4 (=64 channels) or 8 cables (=128) are common
• MIDI Monitor allows watching MIDI in real time
  – Freeware and open source!
5 Feb. 06 25

MIDI Sequencers
• Record, edit, & play SMFs (Standard MIDI Files)
• Standard views
  – Piano roll
    • often with velocity, controllers, etc., in parallel
  – Event list
  – Other: Mixer, “Music notation”, etc.
  – Standard editing
• Adding digital audio
  – Personal computers & software-development tools have gotten more & more powerful
  – => “digital audio sequencers”: audio & MIDI (stored in hybrid encodings)
• Making results more musical: “Humanize”
  – Timing, etc. isn’t mechanical, but not really musical!
8 Feb. 06 26

Another Warning: Terminology (1)
• A perilous question: “How many voices does this synthesizer have?”
• Syllogism
  – Careless and incorrect use of technical terms is dangerous to your learning very much
  – Experts use technical terms carelessly most of the time
  – Beginners often use technical terms incorrectly
  – Therefore, your learning very much is in danger
• Somewhat exaggerated, but only somewhat
5 Feb. 06 27

Another Warning: Terminology (2)
• Not-too-serious case: “system”
  – Confusion because both standard (common) computer term & standard (rare but useful) music term
• Serious case: patch, program, timbre, or voice
  – Vocabulary def.: Patch: referring to event-based systems such as MIDI and most synthesizers (particularly hardware synthesizers), a setting that produces a specific timbre, perhaps with additional features.
The terms "voice", "timbre", and "program" are all used for the identical concept; all have the potential to cause substantial confusion and should be avoided as much as possible
  – “Patch” is the only unambiguous term of the four
  – …but the official MIDI specification (& almost everything else) talks about “voices” (as in “Channel Voice messages control the instrument's 16 voices”)
  – …and to change the “voice”, you use a “program change”!
6 Feb. 06 28

Another Warning: Terminology (3)
• Some terminology is just plain difficult
• Example: “Representation” vs. “Encoding”
  – Distinction: 1st is more abstract, 2nd more concrete
  – …but what does that mean?
  – Explaining milk to a blind person: “a white liquid...”
• Don’s precision involves being very careful with terminology, difficult or not
  – Vocabulary is important source
  – Cf. other sources
  – Contributions are welcome
6 Feb. 06 29

Standard MIDI Files (1)
• File format = encoding
• Standard approved in 1988
• Very compact
• Files made up of chunks with 4-character type
• One Header chunk (“MThd”)
  – Gives format, number of tracks, basis for timing
• Any number of Track chunks (“MTrk”)
  – Stream of MIDI events and metaevents preceded by time
  – 1st track is always timing track
5 Feb. 30

Standard MIDI Files (2)
• Metaevents
  – Set Tempo (in timing track only)
  – Text, Lyrics, Key/time signatures, instrument name, etc.
• What’s missing?
  – Voice information limited to 16 channels
  – Dynamics, beams, tuplets, articulation, expression marks, note spelling, etc.: much less structure than CMN
• Attempts to overcome limitations
  – Expressive MIDI, NotaMIDI, etc.
  – ZIPI
  – In a (more ambitious) way, Csound, etc.
  – None of the limited attempts caught on
5 Feb. 06 31

Separating Representations Doesn’t Work!
(1)
• OK, I’m being overdramatic
  – Really “doesn’t work well for many purposes”
• We shouldn’t be surprised
  – Close relative of “Classification is Dangerous to Your Health”
• Example: many popular notation encodings (e.g., MusicXML) add event info
• Example: multiple domains for notation add in event info (performance domain)
• Example: Csound combines audio & events
• Hybrid systems
12 Feb. 06 32

Separating Representations Doesn’t Work! (2)
• Extreme example of musical necessity: Jimi Hendrix’s version of the Star-Spangled Banner at Woodstock (1969)
  – Goes from pure melody => noteless texture => back repeatedly
  – Cf. 2 kinds of notation: CMN & tablature
  – What would music-IR system do to recognize the Star-Spangled Banner?
  – …or Taps? (a very different problem!)
• Attempts have been/are being made to combine all three basic representations
3 Feb. 06 33

Even One Note can be Hairy
• Experience in the early days of Kurzweil (ca. 1983)
  – Piano middle C(!) never sounded “good”
    • ...except first, low-quality recording
    • Couldn’t tell why from waveform, spectrogram, etc.
  – Variable sampling rates were unusable
    • An expensive mistake: cost ca. $1,000,000
  – Scale on the flute didn’t sound realistic to a flutist, but it was
  – Lesson 1: expectations influence perception
  – Lesson 2: nothing about music is clear-cut or simple
30 Aug. 2006 34