Reading Response #3 – Week 12 "Pedagogy"

I cannot help but go down a dark wormhole when I hear the term 'post-colonial digital pedagogy' and feel like all of our digital teaching tools are still littered with very old ideas. I was surprised to be analyzing the lyrics of a Rihanna song with this week's course materials in the recording of the Digital Caribbean Pedagogies conference. Then again, I had also just recently stumbled upon one of the world's only recorded Sanskrit-language digital libraries on the internet while researching what tools may already exist for ancient Eastern philosophies, after spending thousands of dollars on yoga teacher training and never learning that something like this exists, even though we literally speak this 'dead' language every day. When I try to understand the origins of the digital pedagogical tools that already have life, my thoughts sadly go straight to their shortcomings, and then I simply want to blame our nation's past, institutional racism, ableism, and the not-so-feminist male-to-female ratio in software development for whatever is lacking. How can we use the word 'care' in the digital aspect of these tools if most academic institutions, at least in the United States, do not normally afford enough support to help students develop positive tech habits as they pertain to their education? This is a concern that should be addressed as early as kindergarten.

How is it possible that devices traditionally used for personal entertainment and communication are now expected to be transformed into educational tools? We are facing almost the same problem television faced in the past: generally speaking, how many of us are still tuning in to public access television for educational purposes? Of course, I am asking this question of a group of graduate students who are likely native to technology, so the number would probably be higher, but here is my other question: how many of us are turning to modern social media resources like Instagram to learn new things? In the Digital Caribbean Pedagogies conference, the speakers mention that the sharing of the self and of small bits of knowledge has become fragmented across all of these platforms. My hope is that the digital learning experience does not become so fragmented itself, but I am afraid the upheaval from in-person sessions to Zoom may contribute to just that: more fragmentation. Roopika Risam frames digital humanities pedagogy as an intervention in the post-colonial more generally, and I think she is correct. Cordell reaffirms this with his eagerness to share Digital Humanities with undergrads, but his mistake of not changing the course's name ahead of time made the topic seem uninteresting to young people. This topic is interesting to young people; it just needs to be scaffolded in the right way.

Home Page of Digital Paxton

Workshop Report: “Choosing Your Online Platform: A Crash Course in Scalar”

Early in October I attended a workshop presented by Dr. William D. Fenton, Director of Research and Public Programs at the Library Company of Philadelphia, on “Choosing Your Online Platform: A Crash Course in Scalar”.  Sponsored by the PublicsLab and the GC Digital Initiatives, the workshop was hosted by Dr. Stacy Hartman, Director of the PublicsLab. The workshop offered attendees a wealth of hard-earned lessons Dr. Fenton has accumulated over five years of designing, directing, and maintaining the Scalar-powered digital scholarship project Digital Paxton. As a digital collection, critical edition, and teaching platform built on Scalar, Digital Paxton is an immersive and aesthetically state-of-the-art experience of an 18th century pamphlet war about “a little-known massacre” in 1763 in which “a mob of settlers from Paxtang Township murdered 20 unarmed Susquehannock Indians in Lancaster County, Pennsylvania.” Beyond being a case study of Scalar and offering a comparison with other platforms, the workshop unraveled critical issues and questions about the value and role of digital projects and the digital humanities. Many of the topics we have discussed in our class surfaced through Dr. Fenton’s introduction to Scalar in the context of Digital Paxton.

Evaluating the Need for a Digital Project

Dr. Fenton began the workshop with an illuminating discussion of the elements of a high level needs assessment, which consisted of questions for scholars to consider before beginning a digital project:

  1. What is the relationship of your digital project to your scholarship?
    Dr. Fenton recommended that there be at least a “symbiotic” relationship between the digital project and your scholarship.
  2. What problem does your digital project address?
    In the case of Digital Paxton, creating an online digital edition addressed a number of problems: the last edition of these artifacts was published in 1957 and suffers from issues such as a restrictive price; the distinction between pamphlets, engravings, and political cartoons is ambiguous; many artifacts are cited, clipped, or reprinted without context; only a subset of the approximately 71 pamphlets is available via the Internet, while others are available only through expensive archival services; and finding artifacts is limited by cumbersome search methods that offer little to no "sense of contingency, exchange, and interplay" or of "the gaps between the interpretations" surrounding the massacre, which constitute some of the principal analytical goals of the scholarship.
  3. Who is (are) your audience(s)?
    Dr. Fenton suggested that should the audience be limited to a small group, such as a group of fellow scholars or a dissertation committee, platforms such as WordPress or Manifold may offer a better fit in terms of time and effort needed to complete the work and the affordances for online feedback and discussion. In the case of Digital Paxton the audience was envisioned to include more broadly all interested scholars, educators, their students, and members of the public.
  4. How will you measure success?
    Dr. Fenton envisioned the digital project as a way to: "surface materials that give voice to the 'backcountry' or borderlands"; "provide researchers access to scans and transcriptions"; "foreground the latest scholarship and pedagogy"; "tell multiple stories about and through the Paxton corpus"; "integrate new materials as identified or as they become available". As limitations or critical watch-outs, Dr. Fenton identified the problem of a distorted understanding arising from the lack of records; the risk of reproducing colonial biases, assumptions, and erasures; the inability of artifacts to present alternative imaginaries; and the need to offer a plurality of perspectives, some of which recenter narratives around the nation of the "Conestoga, their resilience, and their central role in the history of colonial Pennsylvania".
  5. How much time are you willing to invest?
    Dr. Fenton offered the aphorism digital archivists commonly attribute to digital projects: “free as in puppies not as in beer”. Digital projects absorb any and all time made available. As a result, an estimate of time required to achieve a good enough result in relation to other commitments is essential to the success of the project.
  6. When is your project complete?
    To avoid impinging on other commitments, Dr. Fenton recommended considering how the project might at a certain point be handed off to an institution or be designed to run with a sufficient degree of automated maintenance.
  7. Does your institution support a particular platform?
    A key consideration in the selection of a platform is the extent to which there is infrastructure support from your institution for the platform. Dr. Fenton worked with and received institutional infrastructure support from Fordham University and The Library Company of Philadelphia and sponsorship from the Historical Society of Pennsylvania.

Comparing Platforms

Dr. Fenton offered a comparative analysis of Scalar with three other platforms: WordPress, Omeka, and Manifold. Each platform has different strengths that suit it to different goals of digital scholarship and community engagement. All platforms are licensed under open source licenses, with code repositories hosted on GitHub.

  1. WordPress, licensed under the GNU General Public License, is a good choice in that it offers: free hosting via CUNY Commons; ease of use; great integration with plugins and themes; high customizability; and affordances for online communication. The codebase is built with PHP and JavaScript and was first released in 2003. Notable examples include: The PublicsLab, The CUNY Academic Commons, and Ghost River.
  2. Omeka, licensed under the GNU General Public License, is a good choice for curating images and collections and is popular with the GLAM community (galleries, libraries, archives, and museums). The platform is structured around objects and collections of objects with a considerable degree of customizability. The platform is sponsored by the Corporation for Digital Scholarship. The codebase is built with PHP and JavaScript and was first released in 2008. Notable examples include: The September 11 Digital Archive, Intemperance, and DIY History.
  3. Manifold, licensed under the Educational Community License version 2.0 ("ECL"), a variant of the Apache 2.0 license, was established through the GC Digital Scholarship Lab. It offers a state-of-the-art user experience based on rapid application development in JavaScript and Ruby on Rails. Manifold is a good choice for projects that benefit most from the transformation of a MS Word, Google Doc, or EPUB document into an online edition. Unlike Omeka and Scalar, Manifold is especially designed for online discourse and academic conversations through advanced annotation and commenting. Version v1.0.0-rc.1 of the codebase was released in March of 2018. Notable examples include: Debates in the Digital Humanities, Metagaming, and Negro and the Nation.

Presenting Scalar

Dr. Fenton's primary takeaway regarding Scalar is its effectiveness in the presentation of non-linear datasets and born-digital scholarship. The platform can be run as a paid hosted instance, a personally self-hosted instance, or an institutionally hosted instance. Artifacts are uploaded into a flat ontology and structured around objects and sequences of objects known as "paths". Scalar's data entities are modeled on the semantic web's Resource Description Framework (RDF), which enables compatibility across schemas. Scalar is a project of The Alliance for Networking Visual Culture with funding support from the Andrew W. Mellon Foundation. Licensed under the Educational Community License version 2.0 ("ECL"), a variant of the Apache 2.0 license, the codebase was released in beta in 2013. Notable examples include: A Photographic History of Oregon State University and Black Quotidian (offering a more custom-coded landing page with several entry points).
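
To make the RDF point a bit more concrete, here is a small sketch in Python using rdflib of what a page and a path might look like as triples. This is purely illustrative and my own construction; the namespace and property names are hypothetical stand-ins rather than Scalar's actual vocabulary.

```python
# A minimal, hypothetical sketch of Scalar-style content as RDF triples using rdflib.
# The "SCALAR" namespace and its property names below are stand-ins, not the real ontology.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS

SCALAR = Namespace("http://example.org/scalar-vocab#")  # hypothetical vocabulary
BASE = "http://example.org/digital-paxton/"             # hypothetical site URL

g = Graph()
page = URIRef(BASE + "pamphlet-1764-apology")
path = URIRef(BASE + "path/pamphlet-war-overview")

# Every artifact is a "page" in a flat ontology, described with standard metadata...
g.add((page, DCTERMS.title, Literal("The Apology of the Paxton Volunteers (1764)")))
g.add((page, DCTERMS.type, Literal("pamphlet")))

# ...and paths are simply ordered sequences of pages layered on top of that flat pool.
g.add((path, DCTERMS.title, Literal("Overview of the Pamphlet War")))
g.add((path, SCALAR.hasPathItem, page))

print(g.serialize(format="turtle"))
```

Because everything is expressed as triples over a flat pool of pages, the same artifact can sit on several paths at once, which is what makes the non-hierarchical navigation described below possible.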

Several distinguishing affordances of Scalar dovetail with the goals of Dr. Fenton’s scholarship including: the non-hierarchical structuring of navigation paths (inviting–or requiring–visitors to discover the content and meaning for themselves); path-making as narrative-making (offering visitors immersive and experiential understanding); multi-directional tagging (offering many bidirectional avenues for discovery); annotations and search options (offering full text transcriptions of the image artifacts and search capability across either titles and description or a full text search of all fields); and contexts as entry points (offering historical overviews and keyword essays as a part of the scholarly apparatus).

Scalar’s information architecture simulates the familiar table of contents metaphor. The table of contents is globally available upon entering the site from the standardized single entry point button on the landing page.  The latest version, 2.5.14, offers a rich media landing page.  For documentation, the University of Southern California hosts a user guide and an introduction built on the platform itself. 

Three limitations of the current version of the platform are: web accessibility (the WAVE web accessibility evaluation tool currently reports a number of errors on the homepage of Digital Paxton); customizability, which is currently achieved for the most part at the code level in JavaScript and CSS; and third-party integrations, which lag behind other platforms.

Finally, Scalar supports annotation features used by Digital Paxton to enable students in classes to submit transcripts of the text residing in images. As an example of the pedagogical expansions of the website, the challenge of transcribing handwritten letters and diaries draws students into the study of palaeography.

All in all, Scalar appears to be a significant step up from other platforms for the immersive experience at scale of digital artifacts and a multiplicity of contextual narratives. Assuming that the advantages of non-hierarchical sequences match the analytical and pedagogical goals of the project, Scalar would seem to be a better choice than Omeka. In my explorations of Digital Paxton I have been drawn into the world of colonial Pennsylvania in a way that I could not imagine possible with a book, whether print or digital, or even a museum. As I explore the significance of the tragic and traumatic events of the massacre in 1763, I am intrigued by Dr. Fenton's theses that the manuscripts tell a different story compared to the printed records, and that the massacre by the "Paxton Boys", together with the propaganda war, created a "template" for the subsequent dispossession of Native Americans from their lands through the terrorism and disinformation of land-hungry white settlers and their allies. I look forward to considering Scalar as a platform and to exploring, both as a model and as history, the paths, contexts, and narratives Dr. Fenton has created through this well-crafted and engrossing digital space.

Additional insights and suggestions are available from Dr. Fenton’s slide deck presented at the workshop.

A Terrifying Tale of Two Vampire Text Analyses

Last month I took a seasonal dive into vampire folklore and its appearance in literature, film, and video games, thinking about the vampire archetype as it correlates to power, illness/medicine, and of course, im/mortality. ("Seasonal" in this case meaning both Halloween and political season.) As a result of having vampires on the brain, I did a quick analysis of how the different spelling variations of the word vampire that I was familiar with (vampyr, vampyre, and vampire) appear in English literature, to see the trends in Google's Ngram tool. I didn't expect to get such meaningful results.

Google Ngram for Vampyr

"Vampyr" alone has a clear bump in published words beginning in the late 1830s, which correlates with literature about the opera Der Vampyr, and sometimes with its source material, a stage play from a similar period that also has Der Vampyr in its title. It also increases in popularity as the overall trend in vampire content increases in the late 20th century, though after reading a bit about Der Vampyr, I wonder if there's a correlation between this spelling and a BBC miniseries based on the opera in the 90s. Google's tools largely return novels with "Vampyr" in their title as the source of the trend, but if I were researching the term further I'd want to know whether the authors had seen the miniseries and whether that influenced their stylistic spelling choice.

The Ngram for "Vampyre" was the richest graph I pulled, as the first bump in the timeline correlates with the short story by John Polidori, The Vampyre: A Tale, published in 1819. While I was familiar with the name, I was not aware of its place in (forgive me) the vampire chronicle: Wikipedia's entry revealed that not only is this considered (along with Bram Stoker's Dracula, later in the century) one of the first of the vampire stories as we know them today, but it was also the source of the source material for Der Vampyr, the opera of the previous Ngram. More trivia: Polidori's Vampyre was the "winner" of a contest between Percy Bysshe Shelley, Mary Shelley, Polidori, and Lord Byron (who, due to an attribution error, was credited with writing The Vampyre for a while). Another famous work submitted to this contest was Frankenstein! Also of note from further Wikipedia diving: Byron references vampires in at least one of his poems from the 1810s and is said to have been the inspiration for Polidori's vampire himself, Lord Ruthven!! The literary tea from this exercise!

Google Books Ngram for "Vampire"

"Vampire," on the other hand, has a clear upward-trending line that correlates with my understanding of the romantic vampire trope's ascendance, and when compared, the standard modern English spelling eclipses the other two starting in the 80s. Without any further research, I wonder whether this correlates with the publication of Anne Rice's series, with the rise of another pandemic, or both. Unfortunately I wasn't able to figure out how to home in on that specific decade, though in the search results Google gave me for the time range chosen by the tool, Rice was the most prevalent author. Many of the other books were anthologies, a sign that enough vampire content had been created by then to collect it.
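
For anyone who would rather pull these trends programmatically than eyeball the chart, the Ngram Viewer's data can be fetched as JSON. The sketch below is my own and relies on Google's unofficial JSON endpoint, so the URL, the corpus id, and the response structure are assumptions that may change without notice.

```python
# Rough sketch: fetch Ngram time series for the three spellings via Google's
# unofficial JSON endpoint (an assumption -- undocumented and subject to change).
import requests

params = {
    "content": "vampyr,vampyre,vampire",
    "year_start": 1800,
    "year_end": 2019,
    "corpus": 26,      # assumed id for the English corpus; may differ
    "smoothing": 3,
}
resp = requests.get("https://books.google.com/ngrams/json", params=params, timeout=30)
resp.raise_for_status()

for series in resp.json():
    # Each entry is expected to have an "ngram" label and a "timeseries" of yearly frequencies.
    peak = max(series["timeseries"])
    print(f"{series['ngram']}: peak relative frequency {peak:.2e}")
```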

Bram Stoker's Dracula accounts for another small bump in the early 20th century, which inspired me to do a quick Voyant comparison between the Dracula text and the Vampyr text, as both are available to the public. This leads me to my second terrifying text analysis, via word clouds.

Dracula Word Cloud
The Vampyr word cloud

There were not many surprises except for one: how infrequently the word "blood" appears, since one assumes hematophagy is one of the defining characteristics of the archetype, in the way that it is one of the defining characteristics of a mosquito. Given that my exposure to the vampire archetype is firmly rooted in the 20th and 21st centuries, my sense of this characteristic may be overly influenced by the vampire of film and television, where themes of the same genre may be weighted differently because of the way the reader/viewer perceives them. This is all speculative, because the text analysis tools alone cannot give me direct insight into film and television trends, but directionally it is an interesting question to ask.
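
The "blood" observation is easy to sanity-check outside of Voyant with a few lines of Python. Here is a minimal sketch that pulls the plain text of Dracula from Project Gutenberg (the URL is my assumption about where the file lives) and counts a handful of terms:

```python
# Minimal sketch: count a few terms in the plain text of Dracula to double-check
# the word-cloud impression. The Project Gutenberg URL is an assumption, and the
# counts include the Gutenberg header/footer boilerplate for simplicity.
import re
from collections import Counter

import requests

DRACULA_URL = "https://www.gutenberg.org/cache/epub/345/pg345.txt"  # assumed location

text = requests.get(DRACULA_URL, timeout=30).text.lower()
words = Counter(re.findall(r"[a-z']+", text))

for term in ["blood", "vampire", "night", "time", "dear"]:
    print(f"{term}: {words[term]}")
```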

I was surprised to find that text analysis, using the simplest tools I could find, created such a rich study of a subject simply by inquiring. I had initially overwhelmed myself with the concept of text analysis, but I was relieved to find that, with a bit of tinkering, the tools invoked a natural sense of curiosity and play, leading to further analysis. (Apologies to all for my tardiness as a result!) As an unintended result of this project, I am inspired to read both of these 19th-century works of the early romantic vampire canon.

CUNY Academic Works Workshop

A couple of weeks ago, I attended a workshop by Jill Cirasella, a CUNY librarian specializing in scholarly communication, about CUNY Academic Works. As a follow-up to other talks and workshops on open and public access scholarship generally, this talk focused on CUNY's own contribution to such work: CUNY Academic Works. The platform is a service of the CUNY libraries dedicated to collecting and providing access to the research, scholarship, and creative work of CUNY; in service to CUNY's mission as a public university, content in Academic Works is freely available to all.

In distinction from open access platforms, CUNY Academic Works is a public access service, in that it does not require an open access license; all that's needed is the right to share your work online. CUNY Academic Works, as Jill laid out, is a great opportunity for CUNY-affiliated people to make their work publicly available and to reach wide audiences, including readers you'd never have imagined would read your work. In fact, you can see the ripple effect of your work with visualizations provided by the service.

The service provides:

  • online access to works not otherwise available
  • cost-free online access to works paywalled elsewhere
  • long-term online access to works on impermanent sites

While most of us may be familiar with the platform in that GC dissertations, theses, and capstone projects must be published on it, with CUNY Academic Works you can also upload:

  • journal articles
  • books and book chapters
  • working papers/reports
  • datasets
  • conference presentations
  • reviews of books, films, etc.
  • open education resources (OER)
  • and other forms of scholarly, pedagogical, or creative work

While a lot of different file types can be uploaded to CUNY Academic Works, dynamic creations can't be uploaded. Usually, code is uploaded in such cases, and a lot of DH practitioners upload .warc (web archive) files.

Jill then went into general concepts around publishing, mentioning that in most cases and with most publishers you are allowed to post some version of your article, and noting that most allow some form of self-archiving. Additionally, you can sometimes negotiate your contract and specify the terms under which you'd like to publish. You can also sometimes ask, after you've published, to add your work to a repository, CUNY Academic Works being one option. She recommended NOT doing so on commercial sites such as ResearchGate and Academia.edu, as these sites sometimes end up being sold, meaning everything disappears. Additionally, these companies actively sell user data for profit.

As for actually uploading to CUNY Academic Works, the process is relatively straightforward. You don't need to create an account with CUNY, but you will need to be affiliated. The submission form asks for the following inputs:

  • List of places where your work will be submitted and live (e.g., GC, Staten Island, etc.); you can indicate your affiliations later
  • Document type
  • Publication date
  • Agreement of ownership
  • Embargo period (i.e. a period during which it is unavailable to the public)
  • Keywords
  • Disciplines
  • Language
  • Abstract field
    • Shouldn't be copy-pasted; Google Scholar will match the abstract to the paywalled version and may not surface the CUNY Academic Works version if the abstract is the same
  • Additional comments
  • Upload file
    • Can also upload additional files

Lastly, Jill gave some advice to authors when considering publishing options, which I found heartening (the following is directly from her slides):

  • Ask yourself why you write (To share ideas, advance theory, add knowledge? To build a reputation? To be cited? To get a job? To get tenure and promotion?)
  • Research any journal/publisher you’re considering. (Quality? Peer reviewing process? Copyright policy?)
  • If you have the right to share your article online, exercise that right! (Whose interests do paywalls serve?)
  • If you don't have the right to share online, request it.

DH Pedagogy Blog Post: Student Empowerment Through Experimentation

It was interesting to get a peek under the hood at crafting DH curricula in this week's readings. Ryan Cordell gave a very clear outline of his trial and error with his early Intro to DH course for undergraduate students. I appreciate his honesty in explaining how and why his department rejected his initial course proposal, and also his near-confession that DH readings and theory don't impress undergrad students. They're unimpressed by the word "digital" because their entire lives are already lived online, although we learn in Risam's 'Postcolonial Digital Pedagogy' that the term "digital native," used to describe students today, is complicated. In fact, Risam introduces us to the idea that the fallacy here is "the assumption that the increased consumption of digital media and technologies produces a deeper understanding of them." No teacher can assume that all of their students are coming in with the exact same skill set at the start of a semester, just as those students shouldn't assume that they have more advanced tech skills than their teacher. Cordell reveals how much he learned from his students' projects over the course of a semester, and suggests colleagues should follow his same pedagogical approach.

I found a few parallels between the Cordell and Risam pieces. One of these is their shared attitude that DH pedagogy shouldn't teach specific tech skills to students, but should let students develop the skills they need by working with an assortment of tools. Risam offers the metaphor of the student as a carpenter, building their own knowledge structures from the ground up, which gives them a huge advantage in being able to identify gaps. I found it inspiring that students should be encouraged to create and explore, emphasizing the role of production over consumption. Cordell expresses similar thoughts and argues that this is the best way to overhaul DH pedagogy.

I agree with both writers' recommendation to have students experiment with access to many different tech skills, and I appreciated that we were given this same freedom in our Intro to DH course. Just as we were encouraged to approach our praxis assignments for this course, DH pedagogy should start small by having students work with a focused group of tools initially, experimenting across different modalities and gradually building their toolbox within their own universe. Both writers also agree that student-produced projects are more valuable than a final essay at showing the DH skills learned during the course, allowing students to better showcase their engagement with the material through interaction. Risam describes how power dynamics in the classroom should shift in this way, rather than having students just learning for the sake of regurgitation at the end, which I think is very empowering for students.

One final theme mentioned in the readings that mirrors our course is teaching students to have a healthy attitude toward failure. When working on both our mapping and text praxis projects, we talked about scope creep and how our expectations changed as we worked with the tools. We didn't always succeed, and it's OK that this happens. Risam brings this up by looking at the relationship between "blue sky thinking" and the "practicality of implementation". Like most things in life, our projects rarely work out the way we intend them to. Teaching students to navigate roadblocks while pursuing the end goal is invaluable to their long-term education and overall success. Being able to experiment and make our own way with these DH tools that are (mostly) new to us is the best way to learn how, why, and when to stir things up and create our own digital spaces. I didn't have the space here to delve into all of this week's readings, but I found them insightful, and I think educators across all fields can benefit from the recommendations in these pieces.

Data Management Plan Workshop

I attended the Mina Rees Library's workshop on data management plans. A data management plan is usually required in grant applications and papers; it describes the research data, the data collection methods, and the procedures for handling the data, which is the material necessary to arrive at the project's conclusions.

We talked about a few reasons to share this data, namely to ensure reproducibility.

We also talked about a few things that need to be included in a data management plan:

  • How is the data exposed? What will be shared, who is the audience, and is it citable?
  • How will it be preserved? The CUNY Academic Works repository came up as a good example, since it makes data accessible from a Google search, for instance. It is important not to archive data in a proprietary format; it should be open, unencrypted, and uncompressed.

We also discussed some best practices for handling data (a small sketch illustrating these follows the list):

  • Some disciplines have specific data structure standards, such as conventions for labeling fields. It is important to follow the standards of your field
  • Column names should be human-readable, not coded (unless a data dictionary is included)
  • It's important to consider how NULL values are represented
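
Here is the small sketch mentioned above, using pandas and entirely hypothetical survey columns, of what readable column names, explicit NULLs, an open format, and a minimal readme can look like in practice:

```python
# Small sketch of the best practices above, using hypothetical survey data:
# human-readable column names, an explicit NULL marker, a tiny data dictionary,
# and export to an open, uncompressed format (CSV) alongside a README.
import pandas as pd

raw = pd.DataFrame(
    {"q1": ["23", "NA", "31"], "q2": ["yes", "no", "NA"]}  # coded columns, ambiguous NULLs
)

# Rename coded columns and declare how missing values are represented.
clean = raw.rename(columns={"q1": "age_years", "q2": "consented_to_followup"})
clean = clean.replace("NA", pd.NA)

data_dictionary = {
    "age_years": "Respondent age in years at time of survey (integer).",
    "consented_to_followup": "Whether the respondent agreed to a follow-up ('yes'/'no').",
}

# Open, unencrypted, uncompressed output plus minimal context for reuse.
clean.to_csv("survey_clean.csv", index=False, na_rep="")
with open("README.txt", "w") as f:
    f.write("Survey data collected for <project>; data dictionary below.\n\n")
    for column, description in data_dictionary.items():
        f.write(f"{column}: {description}\n")
```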

Another best practice we talked about, and one I wanted to discuss further in this blog post, is "context". Sharing spreadsheets without a readme and a data abstract almost guarantees that the data will be taken out of context and used in ways it should not be (to answer questions it cannot answer, for example). This brought me back to a chapter of Data Feminism by Catherine D'Ignazio and Lauren Klein. We read a chapter of this book for the week we discussed Epistemologies of DH, and I recently read chapter 6 for the Advanced Interactive Data Visualization course. The chapter, entitled "The Numbers Don't Speak for Themselves", presents the 6th principle of Data Feminism:

“Principle #6 of Data Feminism is to Consider Context. Data feminism asserts that data are not neutral or objective. They are the products of unequal social relations, and this context is essential for conducting accurate, ethical analysis.”

Klein and D'Ignazio bring up very interesting examples of a lack of context and its unwanted repercussions. At a time when open source is a widely used and encouraged model, it is necessary to consider the impact that one's data, if published and easily accessible, can have.

The first example that came up was a data-driven report by FiveThirtyEight titled "Kidnapping of Girls in Nigeria Is Part of a Worsening Problem." The post aims to show that the number of kidnappings is at a peak, using data from the Global Database of Events, Language and Tone (GDELT). In the report, they said that there were 3,608 kidnappings of young women in 2013. But that was not true. The data source they used (GDELT) is a project that collects and parses news reports, which means that their data could contain multiple records per kidnapping (or any other event), since multiple news reports were probably written about that specific event. GDELT may not have clearly explained this on its website, and FiveThirtyEight clearly used the wrong data to answer its research questions, resulting in a misleading data visualization.
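
A tiny pandas example, with invented columns and values, shows how easily that kind of inflation happens and how a deduplication step changes the count:

```python
# Hypothetical illustration of the FiveThirtyEight/GDELT pitfall: counting news
# records directly inflates the event count, because one event can generate
# several reports. Column names and values here are invented for the example.
import pandas as pd

records = pd.DataFrame(
    {
        "event_id": ["K-001", "K-001", "K-001", "K-002"],
        "source": ["Outlet A", "Outlet B", "Outlet C", "Outlet A"],
        "event_type": ["kidnapping"] * 4,
    }
)

print("Naive count of records:", len(records))            # reports: 4
print("Distinct events:", records["event_id"].nunique())  # actual events: 2
```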

I know I will keep this in mind when working on future data projects and when including a data management plan for my capstone project.

Finding Home & Care in the PR Syllabus

Engaging with the PR Syllabus site made me quite emotional, a reaction I didn't expect. When I was a young adult, nothing remotely similar existed. I spent hours upon hours digging through search engines and Wikipedia pages and deciphering the credibility of suspicious sites and articles in order to slowly and arduously piece together a lineage and history that was slowly forgotten and intentionally erased (via colonization and forced migration). But even after all of those years of self-study, my understanding of Puerto Rican history was still flawed and inconsistent. The one Puerto Rican Studies class I took as an undergraduate was wildly disappointing. Even then, knowing as little as I did, I knew this class lacked historical context and nuance and lazily reinforced cultural stereotypes. But I didn't have the language to express these concerns, nor did I know how to advocate for a syllabus created with intention and care.

Marta Effinger-Crichlow's notion of home resonated deeply here. My need to excavate this history essentially stemmed from a desire to "encounter belonging and care"—a desire to remember and in turn "remain rooted in the diaspora". When I felt ready, I found myself teaching this history back to my mother, grandmother, aunts, and uncles—all island-born Puerto Ricans. I was deeply invested in our collective "knowing". But Effinger-Crichlow prompted me to think about how much easier and quite beautiful it could have been to find home in a digital space like the PR Syllabus. How might this have enriched or shifted the trajectory of my family's lives? How much more involved, actionable, and collaborative could I have been if I had access to something as simple as the PR Syllabus' list of activist organizations and citizen initiatives?

I am so grateful for the folks who have co-created this immense but necessary living project. It definitely creates opportunities for both physical and digital manifestations of home.

Accessibility Customers

Workshop: Designing for Web Accessibility

I attended a Web Accessibility Workshop and Wikipedia Edit-a-thon on October 22nd, hosted by Silvia Gutiérrez De la Torre (College of Mexico) and J. Matthew Huculak (UVic). This session was a great overview for those who want to learn more about how to make their content more accessible on the web and why accessibility's inclusion in technology and on the web is so important for all of us, even those who do not have any disabilities. In addition to folks wanting to support the web's level of accessibility, this workshop would also be of interest to those who study textual data on the internet and to people getting started with basic web development. I personally chose to attend this workshop because I spent some years teaching VoiceOver for a major tech company and can attest firsthand to its miracles and frustrations for users.

The session started from the ground up, defining what accessibility means in terms of technology and devices (which I've come to learn is now referred to as "access technology") and identifying the audiences that require this type of development in tools. Much of the focus of this workshop was on people who experience visual challenges, namely those who use screen reader technologies to view websites, as well as researchers of all kinds. We received a general overview of the different screen reader technologies that are available, such as Apple's VoiceOver and Microsoft Narrator, both of which can magically read your computer's screen back to you with a simple keystroke, along with their uses and limitations. We also learned that there is much room for improvement in what text is actually read back to you, and that there is value in having the content of the web be more coherently organized from a perspective even broader than the accessibility world.

Over the years, we have all seen the web develop into a magical place of information exchange, and while it is important for all beings to be able to access and understand what is in their web browser, the issue of accessibility on the web goes far beyond having a news site read back to you. It has become more about adding well-written, legible descriptive text to images, maps, diagrams, pie charts, videos, spreadsheets, and data visualizations of many kinds, which certainly extends this conversation to academia's presence on the world wide web. This makes alt text (or alt attributes) very important for students just like you and me, and even more so for blind students or students with dyslexia who need the entirety of their textbooks spoken aloud by a screen reader so that they can learn from an illustrated example. A huge takeaway from this workshop for me personally was that adding alt attributes to images is also extremely important for researchers who need to study text descriptions of images and datasets on places like Wikipedia and other research sites, making it doubly important for archival purposes and throughout academia. Alt text can convey the purpose of a scientific image, such as a pie chart in a scholarly publication, even to a blind researcher or student who comes across the article.
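
Because alt text lives in ordinary HTML, it is also easy to audit programmatically. The following is a small sketch of my own (the URL is just a placeholder) that lists the images on a page that are missing alt attributes, using requests and BeautifulSoup:

```python
# Small sketch: flag <img> elements that are missing alt text on a page.
# The URL is a placeholder; swap in any page you want to audit.
import requests
from bs4 import BeautifulSoup

URL = "https://example.org/some-article"  # placeholder

html = requests.get(URL, timeout=30).text
soup = BeautifulSoup(html, "html.parser")

for img in soup.find_all("img"):
    alt = (img.get("alt") or "").strip()
    if not alt:
        print("Missing alt text:", img.get("src"))
```

Tools like the WAVE evaluator mentioned earlier do this kind of check (and much more) automatically, but even a few lines like these make the point that alt text is simply part of the page's data.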

In order to understand the roots of the issue, we were brought back to the beginning of the internet's time and took a look at the "Web Content Accessibility Guidelines (WCAG) Overview" from the W3C's Web Accessibility Initiative, a free resource that is also helpful for those wanting to learn more about HTML, CSS, and accessible markup. As a former technical trainer with experience using screen reading technology, I was not surprised to see such a call for better alt-text descriptions of tables, pie charts, maps, Venn diagrams, chemistry diagrams, and digital images; in my experience, we were lucky if the screen reader could find any alt text at all, and it is only recently that VoiceOver describes what is in a digital image. It was also mentioned that alt text can be generated by artificial intelligence, but the presenters shared some examples of automatically generated alt text that were very poorly done, which means this type of work still requires human intervention (and gives me hope that maybe I can get a job doing this somewhere).

The other part of this workshop was a Wikipedia Edit-a-Thon, which we could not participate in unless we were experienced in writing these types of alt-text descriptions. I decided to skip this portion of the session, but to further our learning independently we were provided with several resources, including the "Image Description Guidelines" by the Diagram Center, which is an incredible guide filled with examples of appropriate alt-text descriptions. In my personal opinion, having this alone was worth attending the workshop in the first place, as it even includes advice on how to format and lay out objects on the web and how to pitch your wording appropriately for any given audience.

Accessibility has been a great passion of mine for many years, so I was very glad to see this topic integrated into our digital humanities workshop offerings, and I loved that so many people are interested in improving the web's usability for screen readers and researchers. The photo above is of me teaching VoiceOver for a major tech company. I have compiled the resources provided at this workshop below.

Web Content Accessibility Guidelines (WCAG) Overview

How to Write Alt Text and Image Descriptions for the visually impaired

Image Description Guidelines

Wikipedia:Manual of Style/Accessibility/Alternative text for images

Workshop: “Dismantling” – ITP @ NYU

ITP at New York University hosted a workshop called "Dismantle/Repurpose". We were asked to bring old objects to be dismantled: out-of-date electronics, unworn sweaters, found furniture, etc. While we were using tools to physically deconstruct our objects, each speaker facilitated a discussion regarding the notion of "dismantling as a political/social/meditative act to create connections between the ideas presented and (un)making".

The workshop began with the implementation of basic safety rules and procedures:

  1. Make sure your electronics are not plugged into an outlet or any other source of electricity (such as a battery pack). Use a rubber or non-metal base to work on.
  2. Be aware of static electricity. You can ground yourself with a small piece of metal (a penny or any other coin). Hold it for 10-30 seconds.
  3. Be mindful of sharp objects, tools, or breakage when dismantling your objects.
  4. Pull long hair away from face and secure.
  5. Work in a well-lit space.
  6. Make sure you are not under the influence of any substances that can impair judgment, attention, coordination, and motor skills.
  7. If you have any issues with or questions about your object, ask to be placed in a break-out room with a specialist who can support you.

I chose to dismantle an old, two-sided magnification mirror that used to light up. See pictures below (before & after partially dismantling):

 As we began to deconstruct our objects, Dr. Cyd Cipolla, a professor of Feminist Technologies, spoke to us about the relationship between humans and non-human machines. Below are some points I was able to jot down in between dismantling:

  • There is a contested boundary between the concrete and ephemeral/discursive and material. Human-made things are socio-material objects. This means that objects are situated and changed based on social setting and social movement.
  • One problem with machines is that they guide you through your interactions with them. For example, a copy machine will provide you with instructions on how to use it: what to press if you want a certain number of copies, or what to choose if you want it printed in black and white or color. It isn't until the machine stops working as intended that we are able to engage with it differently.
  • The exercise of dismantling an old object of our choosing allowed us to deconstruct the idea of an object alongside the material itself. It provided us with an opportunity to dismantle our own internal biases. Ask: what could be done differently? What can be changed? What are the possibilities?
  • Cipolla then provided us with a “Manifesto for Dismantling”:
    • Go Slow
    • (Dis)connect
    • Respect the work
    • Always ask questions

The next speaker was Ashley Jane Lewis, an interactive artist/maker and student at NYU's ITP. She focused on the need to first acknowledge that tech labs and maker spaces are mostly developed, occupied, and led by white men. She explained that this creates a web of obstacles for BIPOC folks to navigate in spaces that were not created with them in mind. She offered that one way to begin tackling this issue is by dismantling and reconstructing spaces through the use of Codes of Conduct. This would begin to detangle the web and create points of access for BIPOC and differently-abled folks. Lewis also explained that a Code of Conduct should be a living document, and that as you make changes and adjustments to it, you must keep in mind who you are dismantling for. Think about where you ask questions and where you make assumptions. Here is the link to the Code of Conduct she co-created at ITP:

https://github.com/ITPNYU/ITP-IMA-Code-of-Conduct

Lena Warnke, a cognitive scientist and PhD candidate at Tufts University, discussed the neural underpinnings of bias in the brain. She explained that the neural processes behind stereotypes and prejudices are different and therefore require different approaches. In the first image she showed, the amygdala, insula, and striatum are highlighted as regions involved in prejudice; the other image showed the regions involved in stereotyping:

Warnke makes it clear that there should be more work done to link neuro and social sciences. But she also explains that neuroscience research is still in its infancy and therefore these connections haven’t been fully explored yet.

Francisco Navas, Digital Producer and Journalist for The Guardian, broke down so much of what we have learned this semester regarding the myth of objectivity. He explained that his journalism studies required a promise to seek out the "truth", to provide a broad scope to his readers, and to be "unbiased". But what he quickly learned about "objectivity" is that it is white and male. He then used an onion as a metaphor, peeling away the layers of this lens to dismantle objectivity.

Rashida Kamal is a recent graduate of ITP at NYU. She is a researcher, programmer, and designer, and a founding member of the Trash Club, a community platform for investigating waste and waste infrastructures through scholarship, community-driven action, and art. Kamal discussed dismantling e-waste. She explained that we receive e-waste only after it has been through all of its processes: the excavation of natural resources, production, distribution, usage, and then disposal. She is invested in dismantling our currently insufficient waste systems and creating new ones. Kamal made me think about and question our role as digital humanists in this endeavor. We are very focused on access and on creating new avenues to information and technology, but as humanists, shouldn't e-waste and its environmental impacts be heavily considered in our work? Perhaps an eco-critical approach could reinforce this?

I really loved the way this workshop used the physical practice of dismantling as an opportunity for us to think critically and deeply about the theoretical and methodological elements of this work. During this holistic practice and experience of dismantling, I was able to make many connections, but I was also confronted by many questions and limiting thoughts and feelings around my relationship with technology:

  1. I was hesitant and resistant to dismantling the mirror because I was afraid of doing it wrong. I was afraid of the lack of instruction and structure. I had the freedom to engage with this object in whichever way I wanted, but because I have only interacted with this piece of technology in the way that is intended, I was more hesitant than I anticipated.
  2. I was anxious about not being able to put the object back together. Not because I was attached to it or because I wouldn't be able to utilize it again (it was already broken), but because I was fixated on the original intention of its use and therefore its "correct" structure. This and my previous point brought Cyd Cipolla's mention of the socio-materiality of objects to life for me.
  3. Tools were a huge factor in the ways I could(n’t) dismantle/reconstruct my object. This brought up a few things and reminded me of some of our class discussions: Which tools (technologies) are best for this project? Do I have access to these tools? If I don’t, how can I hack (or repurpose) the tools that I do have in order to get as close as possible to my desired outcome? What are the limits of these tools and how can I make them better?

Again, this was a wonderful workshop and great learning experience. I encourage everyone to attempt this practice and take note of your process!

Text Analysis Praxis: An Assessment of Computational Techniques in “A Computational Analysis of Constitutional Polarization”

In a paper published early last year in the Cornell Law Review entitled "A Computational Analysis of Constitutional Polarization", law professors David E. Pozen, Eric L. Talley, and Julian Nyarko describe their efforts to analyze remarks published in the US Congressional Record between 1873 and 2016. Their principal research question asks "whether and to what extent major political blocs in the United States have diverged in the ways they think and talk about the Constitution" (2019, 7). This praxis assignment undertakes a preliminary dissection of the project's code and data, made available by the authors. Through this dissection, it seeks to contribute toward (1) the development of a methodology for critiquing computational and statistical techniques and (2), perhaps more importantly, a way to ascertain whether or not confirmation bias is present.

While some of the underlying assumptions related to claims of "polarization" and "talking past each other" call for an assessment of the argument's overall logic, Pozen, Talley, and Nyarko offer an impressively robust set of techniques and methodologies, which incorporate a number of self-critical considerations of their approach and strategies. Included in these considerations are: the attempt to correct problems in the data relating to misspellings and OCR failures; the use of five dictionaries containing varying levels of semantically coarse constitutional terms and n-grams; the use of several classifiers, including the Multinomial Naive Bayes classifier, the Multilayer Perceptron classifier, the K-Neighbors classifier, the Gaussian Process classifier, the Decision Trees classifier, and the C-Support Vector Classification classifier; the use of three measures of classifier performance, including the standard rate of correct classification, the F1 measure, and the AUC-ROC measure; the generation of word frequencies and tag clouds; attempts to control for additional variables, including the length of remarks; and the comparative analysis of congressional language with language in newspaper editorials.

Pozen, Talley, and Nyarko's central finding is that "[r]elative to the early and mid-twentieth century, it has become substantially easier for an algorithmic classifier to predict, based solely on the semantic content of a constitutional utterance, whether a Republican/conservative or a Democrat/liberal is speaking" (2019, 4). "Beginning around 1980, our classifier thus finds it increasingly easy to predict the political party of a congressional speaker" (2019, 38). Thus, according to the findings of the analysis, "constitutional polarization…has exploded in Congress over the past four decades." The link between predictability of ideology and polarization invites questions about the extent to which "ideologically coherent and distinct" unequivocally translates to "[d]ivision into two sharply contrasting groups or sets of opinions or beliefs" (2019, 34, 8).

Before examining the project's use of computational techniques, an overview of the provenance of the data follows. The data used in the analysis, which includes "13.5 million documents, comprising a total of 1.8 billion words spoken by 37,059 senators and representatives", comes from a dataset prepared by Matthew Gentzkow (a member of the Department of Economics at Stanford University and of the National Bureau of Economic Research (NBER), a private non-profit research institute), Jesse M. Shapiro (a member of the Economics department of Brown University and a member of NBER), and Matt Taddy (affiliated with Amazon) (2019, 18).

For the OCR scans of the print volumes of the Congressional Record, Gentzkow, Shapiro, and Taddy rely on HeinOnline, a commercial internet database service owned by William S. Hein & Co., Inc., based in Buffalo, New York, and which specializes in publishing legal materials.  Gentzkow, Shapiro, and Taddy received funding from the Initiative on Global Markets and the Stigler Center at Chicago Booth, the National Science Foundation, the Brown University Population Studies and Training Center, and the Stanford Institute for Economic Policy Research (SIEPR), and resources provided by the University of Chicago Research Computing Center.

The data provided by Pozen, Talley, and Nyarko consists of a 1.1 GB binary file containing the word embeddings used for the classifier. Additionally, there is a 0.4 MB zip file containing a CSV file for each of the 72 congressional sessions with the frequencies of the n-grams used for the tag clouds.

Pozen, Talley, and Nyarko apply a number of preprocessing transformations to the corpus, including the creation of word embeddings as a way to correct for misspellings, miscodings, and OCR issues. One transformation the authors did not undertake is the removal of stop words, because a number of constitutional phrases contain stop words. The following table presents a summary, included in their report, of the breakdown of the corpus.
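
To give a rough sense of how embeddings can help with OCR-style errors, here is a toy sketch using gensim. It is my own illustration of the general idea, not the authors' code, and the corpus, thresholds, and helper function are invented for the example:

```python
# Toy illustration (not the authors' code) of using word embeddings to catch
# OCR-style misspellings: a rare token whose embedding neighbors include a much
# more frequent, similarly spelled word gets mapped onto that word.
from difflib import SequenceMatcher

from gensim.models import Word2Vec

# Invented stand-in for tokenized congressional speeches, with one OCR error.
sentences = (
    [["the", "first", "amendment", "protects", "free", "speech"]] * 300
    + [["the", "first", "amendnent", "protects", "free", "speech"]] * 20
    + [["congress", "shall", "make", "no", "law"]] * 300
)

model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=20, seed=1, workers=1)


def canonicalize(token, model, spelling_floor=0.8):
    """Map a token to a more frequent, similarly spelled embedding neighbor, if any."""
    token_count = model.wv.get_vecattr(token, "count")
    for neighbor, _similarity in model.wv.most_similar(token, topn=10):
        similar_spelling = SequenceMatcher(None, token, neighbor).ratio() >= spelling_floor
        more_frequent = model.wv.get_vecattr(neighbor, "count") > token_count
        if similar_spelling and more_frequent:
            return neighbor
    return token


print(canonicalize("amendnent", model))  # should map to "amendment" on this toy corpus
```

The authors' actual correction procedure may differ in its details; the point is only that embedding neighborhoods combined with a spelling check can surface likely OCR variants.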

Summary Statistics of Congressional Record Corpus


While the Python and R code provided by the authors does not run without modification (the Python code is saved to files as Jupyter notebook code blocks), there is enough code that, together with the methodology described in the project's paper, the processing paths and the implementation of the algorithms can be discerned. The speeches are vectorized in a pipeline as part of the process of training the models. For predicting party (Republican vs. Democrat) and ideology (conservative vs. liberal), the project uses scikit-learn's multinomial naive Bayes classifier as its primary classifier and a cross-validation predictor that follows a 5-fold cross-validation strategy. The data is split into 80% training data and 20% testing data. The R code, which creates the line charts and scatter plots from the results of the classifications, utilizes ggplot2 and the cowplot add-on.
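
For readers who want a feel for this setup, the following is a condensed sketch of the modeling pipeline in scikit-learn. It is my reconstruction from the paper's description rather than the authors' code, and the toy speeches and labels stand in for the Congressional Record data:

```python
# Condensed sketch of the modeling setup described above (my reconstruction, not
# the authors' code): vectorize speeches, train a multinomial naive Bayes
# classifier in a pipeline, evaluate with 5-fold cross-validation and an
# 80/20 train/test split, and report accuracy, F1, and AUC-ROC.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Toy stand-in for (speech text, party label) pairs from the Congressional Record.
speeches = [
    "the second amendment protects the right to bear arms",
    "the first amendment and free speech for all",
    "equal protection under the fourteenth amendment",
    "original meaning of the commerce clause",
] * 50
labels = [1, 0, 0, 1] * 50  # 1 = Republican/conservative, 0 = Democrat/liberal (toy labels)

pipeline = Pipeline([
    ("vectorize", CountVectorizer(ngram_range=(1, 2))),
    ("classify", MultinomialNB()),
])

# 5-fold cross-validated class probabilities over the full corpus.
probabilities = cross_val_predict(pipeline, speeches, labels, cv=5, method="predict_proba")[:, 1]
print("AUC-ROC:", roc_auc_score(labels, probabilities))

# 80/20 split for held-out accuracy and F1.
X_train, X_test, y_train, y_test = train_test_split(
    speeches, labels, test_size=0.2, stratify=labels, random_state=0
)
pipeline.fit(X_train, y_train)
predictions = pipeline.predict(X_test)
print("Accuracy:", accuracy_score(y_test, predictions))
print("F1:", f1_score(y_test, predictions))
```

The actual project also benchmarks the other classifiers and performance measures listed above; swapping MultinomialNB for another scikit-learn classifier in the pipeline is a one-line change.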

The project includes the application of a number of additional computational techniques and secondary findings, including:

  • that polarization has grown faster in constitutional discourse than in nonconstitutional discourse;
  • that conservative-leaning speakers have driven this trend;
  • that members of Congress whose political party does not control the presidency or their own chamber are significantly more likely to invoke the Constitution in some, but not all, contexts; and
  • that contemporary conservative legislators have developed an especially coherent constitutional vocabulary, with which they have come to “own” not only terms associated with the document’s original meaning but also terms associated with textual provisions such as the First Amendment. (2019, 1)

Further evaluation of the project’s code would be needed in order to complete an assessment of the techniques and methodologies underlying these secondary findings.

Conclusion

Through a preliminary evaluation of the code, data, and computational techniques underlying "A Computational Analysis of Constitutional Polarization", this praxis assignment has attempted a critical assessment of the methodologies Pozen, Talley, and Nyarko leverage in their research. While the ample use of self-critical analysis argues against confirmation bias, there would nevertheless appear to be a weakness in linking "ideological coherence" with the claim of "talking past each other". As one of a number of examples of further research, Pozen, Talley, and Nyarko suggest the application of computational techniques to validate or invalidate the argument made by Aziz Rana that "the culture of 'constitutional veneration' is a relatively recent phenomenon bound up with the Cold War effort to justify American imperial ambitions" (2019, 68).

Works Cited
Gentzkow, Matthew, Jesse M. Shapiro, and Matt Taddy. 2018. Congressional Record for the 43rd-114th Congresses: Parsed Speeches and Phrase Counts. Palo Alto, CA: Stanford Libraries [distributor], 2018-01-16. https://data.stanford.edu/congress_text

Pozen, David E., Eric L. Talley, and Julian Nyarko. 2019. "A Computational Analysis of Constitutional Polarization." Cornell Law Review 105: 1-84. SSRN: https://ssrn.com/abstract=3351339.