Some Thoughts on AI-Augmented Community Research at the Charlotte Urban Institute

By Kailas Venkitasubramanian in Data Science, Research Methods, Work

November 11, 2025

I thought I'd compile a few thoughts on using AI at the institute, and more broadly in organizations similar to ours. I've been using AI tools in my research for over a year, and I still can't decide whether I'm more excited or more unsettled, though these days it leans toward the former. But this ambivalence feels like the right starting point for a conversation about where we, as an institute built on community trust and research, actually want to go with this. When I say AI, I mean generative AI.

This isn't a policy document. It's a set of observations from someone who has been running experiments in the middle of real project work, combined with questions I think we need to sit with collectively. I'm writing it now, before any formal process kicks off, because I'd rather we talk our way into a shared position than adopt one reactively at some point.

What AI Actually Does Well in Our Work

The noise around AI makes it hard to separate the genuinely useful from the merely impressive, so I'll stick to what I've seen so far.

For quantitative work, coding assistants have been the clearest win. Tasks that used to send me down a Stack Overflow rabbit hole for an hour — a spatial join in R, a stubborn ETL bug, a regression table formatted just so — now take minutes. I’ve been using ChatGPT and Claude for this. What surprised me is that the gains aren’t only for the most technical people on the team. Someone who writes occasional scripts gets as much, sometimes more, out of these tools as a seasoned data scientist. The bar to doing sophisticated data work has dropped in a real and measurable way.

For qualitative work, the picture looks different but the time savings are comparable. Automated transcription is now reliable enough that I’d use it in production. AI-assisted literature synthesis — surfacing themes across fifty papers instead of ten — compresses weeks into hours. And for drafting, the grind of turning structured notes into readable prose has gotten noticeably faster.

Peer organizations are moving in the same direction. The Urban Institute in Washington has started publishing methodological notes and guidelines on AI-assisted research. RTI International is running pilots on AI-enhanced survey tools. University-based policy centers at UNC Chapel Hill and Georgia Tech are building internal toolkits. None of this is happening quietly. If we treat it as optional or peripheral, we will fall behind organizations doing similar work, and that gap will show up in what we can offer partners and communities.

Pain Points

Many. And new ones surface every day.

AI makes things up. Not occasionally — regularly. I’ve had tools cite papers that don’t exist, generate statistics that are entirely fabricated, and produce code that runs clean but returns wrong answers. This isn’t a quirk you learn to work around once but a feature of how these systems work, at least for now. In our context, the stakes are high enough that this must be said plainly: AI output is a rough draft, not a finding. Everything it produces needs human verification before it goes anywhere near a report, a client, or a public dashboard.

The data governance problem is the one that keeps me up at night. Most of the powerful AI tools — the free versions of ChatGPT, Claude, Gemini — were not built with research confidentiality in mind. When you paste a community partner’s data into one of those interfaces, you don’t fully know where it goes, whether it gets used to train future models, or who eventually has access to it. For a lot of what we work with — individual survey responses, health records from partners, juvenile justice data, housing assistance files, basically any Data Trust dataset — this is not a hypothetical risk. It’s a compliance issue, an IRB issue, and a matter of keeping faith with the communities who trusted us with that information.

Some data should not touch an external AI tool. Personally identifiable information, data covered by FERPA or HIPAA, records shared under data use agreements with explicit restrictions — none of that changes because the tool is convenient and powerful. Whatever framework we develop has to answer one question clearly first: which data can go into which tools, and under what conditions? As an aside, I've been working on frameworks for how simulated data could improve productivity without data risk. More on that later.

There’s also a quieter integrity question I haven’t fully resolved. When AI assists in generating analysis or text, what do we owe partners and audiences in terms of disclosure? How do we document AI-assisted workflows in methods sections? The research field broadly is still sorting this out, and I don’t think we’re behind — but it’s a question we should work through together rather than letting each person decide on their own.

Data Governance Comes First

Data governance has to come before tool adoption, before training programs, before any ambitious product ideas. Given the pace at which things are changing, though, that sequencing is challenging.

The practical starting point is knowing what we actually have. Not all data is the same. Publicly aggregated data — the indicators in the Quality of Life Explorer, Census datasets — can go into approved AI tools without significant risk. Internal drafts and non-sensitive analyses sit in a middle tier where enterprise tools with institutional data privacy agreements — Microsoft Copilot through our UNC Charlotte license, Google Gemini through our institutional account — offer reasonable protection. Identified data, data with legal restrictions, data shared under a partner’s trust: that stays off external AI systems.
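The three tiers above could be captured in something as simple as a lookup that a researcher consults before pasting anything into a tool. Here is a minimal sketch in Python; the tier names, rules, and tool categories are my own illustration of the idea, not an adopted policy:

```python
from enum import Enum

class DataTier(Enum):
    PUBLIC = "public"          # e.g., Quality of Life Explorer indicators, Census data
    INTERNAL = "internal"      # drafts, non-sensitive analyses
    RESTRICTED = "restricted"  # PII, FERPA/HIPAA data, records under data use agreements

# Illustrative mapping of tiers to tool categories; the real list would
# come from an institutional review, not from this sketch.
ALLOWED_TOOLS = {
    DataTier.PUBLIC: {"approved consumer AI tools", "enterprise AI tools"},
    DataTier.INTERNAL: {"enterprise AI tools"},  # e.g., tools under an institutional agreement
    DataTier.RESTRICTED: set(),                  # no external AI systems at all
}

def classify(has_pii: bool, under_dua: bool, is_public_aggregate: bool) -> DataTier:
    """Assign a dataset to a tier; restrictions take precedence over openness."""
    if has_pii or under_dua:
        return DataTier.RESTRICTED
    if is_public_aggregate:
        return DataTier.PUBLIC
    return DataTier.INTERNAL

# Example: an identified health-records extract shared under a data use agreement
tier = classify(has_pii=True, under_dua=True, is_public_aggregate=False)
print(tier.value, ALLOWED_TOOLS[tier])  # restricted, with an empty tool set
```

The point of the sketch is the precedence rule: a single restriction flag outranks everything else, which is exactly the call a researcher needs to be able to make in the moment.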

This isn’t just about legal compliance. The people and organizations who share data with us do so because they believe we’ll treat it carefully. Routing their data through AI systems they didn’t agree to, even in ways that technically clear a regulatory bar, is a kind of breach. As we think about AI in community-facing work especially, transparency about when and how these tools are involved isn’t optional.

The Things I Think We Could Actually Do

I’m not trying to sell AI as a silver bullet. But I do think there are specific things within reach that are worth taking seriously.

The Quality of Life Explorer and the Regional Explorer are good places to start imagining. They've worked precisely because they take complex regional data and make it usable by people who aren't analysts. A conversational layer on top — where a resident or a partner could type "which neighborhoods have seen the biggest increase in housing cost burden?" and get a grounded, plain-language answer from our data — is not a distant aspiration. It's achievable in the near term.
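"Grounded" is the key word there, and it has a concrete shape: retrieve the matching rows from our own indicator data first, then hand the model only those rows along with the question. A hypothetical sketch in Python; the dataset, field names, and numbers are invented for illustration:

```python
# Hypothetical "grounded answer" flow: the model never sees a question
# without the matching rows from our own indicator data attached.
# Neighborhood IDs and percentages below are invented for illustration.

INDICATORS = [
    {"neighborhood": "NPA 17", "metric": "housing_cost_burden_pct", "y2015": 28.0, "y2023": 41.5},
    {"neighborhood": "NPA 42", "metric": "housing_cost_burden_pct", "y2015": 33.0, "y2023": 36.0},
    {"neighborhood": "NPA 88", "metric": "housing_cost_burden_pct", "y2015": 22.0, "y2023": 39.0},
]

def biggest_increases(metric: str, top_n: int = 2) -> list[dict]:
    """Step 1, retrieval: pull and rank the rows that can answer the question."""
    rows = [r for r in INDICATORS if r["metric"] == metric]
    rows.sort(key=lambda r: r["y2023"] - r["y2015"], reverse=True)
    return rows[:top_n]

def build_prompt(question: str, rows: list[dict]) -> str:
    """Step 2, grounding: instruct the model to answer only from these rows."""
    context = "\n".join(
        f'{r["neighborhood"]}: {r["y2015"]}% (2015) -> {r["y2023"]}% (2023)' for r in rows
    )
    return (
        "Answer in plain language using ONLY the data below. "
        "If the data cannot answer the question, say so.\n\n"
        f"Data:\n{context}\n\nQuestion: {question}"
    )

rows = biggest_increases("housing_cost_burden_pct")
prompt = build_prompt(
    "Which neighborhoods have seen the biggest increase in housing cost burden?", rows
)
# `prompt` would then go to whichever approved model the institute settles on.
```

Because the answer is constrained to retrieved rows, a fabricated statistic has nowhere to hide: anything the model says can be checked against the context it was given.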

Also, how can we leverage the plethora of tools available to match the diverse talent we have? We have qualitative researchers who do great work with communities but have never needed to write code. We have researchers who work with survey data and statistics but lack the computer science training to turn workflows into applications. AI tools you can interact with in plain English have already expanded what both groups can do independently. That's not trivial.

Across both possibilities, the pattern is the same: AI doesn’t replace the research judgment we’ve built at the institute — the knowledge of our region, the relationships with partners, the understanding of what questions actually matter. AI compresses the distance between that judgment and a finished, useful product.

Questions I Don’t Have Answers To

There are many things I don’t know, because some of these are genuinely hard.

How do we build AI literacy across a team with very different technical starting points, without boring the people who are already ahead or leaving behind the people who are skeptical? What does responsible disclosure look like when AI is involved in generating analysis — and does the answer change depending on whether it helped write prose versus contributed to the actual findings? How do we vet tools consistently rather than ending up with a patchwork of individual practices and uneven privacy risks? And the one that nags at me most: what do our community partners and the residents whose data we hold have a right to know about how we’re using these tools in research that affects their lives?

These are not rhetorical. They need actual conversation and deliberation, and they need to start now, at least internally.

A Few Near-Term Things Worth Doing

Without getting ahead of any formal process, here are actions that seem low-risk and worth starting now.

First, find out what tools people are already using. My guess is that a fair number of colleagues are already experimenting, which is fine, but we probably don’t have a clear picture of what’s in use, on what kinds of tasks, or whether anyone is inadvertently working with sensitive data in ways that create compliance exposure. Maybe a short internal survey would work?

Second, identify two or three researchers willing to experiment intentionally and report back — ideally spanning qualitative and quantitative work. Peer learning is how practice actually changes. Building some light structure around early experimentation would teach us more, faster, than any policy document. I shared a few of my insights on using AI in the research life-cycle last summer. Frankly, last summer feels like a lifetime ago given how fast the technology space is moving.

Third, put together a simple data classification reference — one page listing which tools are appropriate for which types of data — building on what UNC Charlotte's Office of OneIT has already published. Researchers need to be able to make that call in the moment, before a formal AI policy exists. Give them something to work from.

And keep the conversation going. The organizations that have handled AI adoption well built shared norms through talk before they tried to codify anything. We should do the same.

I’ve been sitting with these questions for a while, but my vantage point is narrow. Qualitative researchers have a different relationship to these tools than I do. Data scientists see possibilities I probably can’t see from where I sit. Those working directly with communities have a perspective on trust and ethics that I need to hear.

Maybe we should have a forum to share — a workflow that's working, a problem you ran into, a question you can't shake. Either way, I feel we need to keep a steady cadence of conversation at the institute about how we adopt and adapt.