Obsidian Entertainment’s Avowed is a major technical achievement for the fabled role-playing game studio. Although the company has experience with 3D action RPGs like Fallout: New Vegas and The Outer Worlds, Avowed brings the fantasy world of the company’s isometric Pillars of Eternity series to a massive 3D world. Like its predecessors, Avowed lets players make a dizzying number of decisions in gameplay and dialogue that all intersect and shape their journey. Combine these branching choices with the game’s advancements in first-person combat and animation, and you have an exceptionally complex game at risk of launching with a distressing number of bugs.
But that didn’t happen. Avowed launched to critical acclaim, with some critics noting the game isn’t as “glitchy” as many of Obsidian’s beloved older titles. The studio’s reputation for buggy games was so noteworthy that at the 2025 Game Developers Conference, Obsidian QA lead David Benefield briefly mentioned it in his presentation on Obsidian’s improvements to its QA workflow.
What are these improvements? According to Benefield, Obsidian has spent the last decade (since the release of 2016’s Tyranny) restructuring its QA department to work more closely with the rest of the studio. QA testers became QA analysts, and instead of only running tests on builds of games like Avowed, the QA team began working with designers of all stripes to review their work in Obsidian’s narrative tool and Unreal Engine, spotting bugs before anyone hit the “commit” button.
That process might sound daunting, but if you want to bolster your QA team, Benefield said you can boil the process down to one phrase: “train your QA team using whatever methods you’re training your designers.”
Obsidian’s QA testers got access to Tyranny’s narrative tools
The yearslong journey to reinvent Obsidian’s QA department began in 2015 while developing the CRPG Tyranny. Dialogue in Tyranny is accompanied by portraits of characters in different premade “poses” that illustrate their emotional state. Initially these poses were to be set up by the narrative team working in the Obsidian narrative tool, but according to Benefield, that team had become bogged down creating quests and content for the game, and work on implementing poses was falling behind schedule.
Implementing these poses wasn’t a complex process. It just required hours and hours of work, and the QA team had the bandwidth to take up the task. But with access to the tool, they began to realize there was loads of implemented content they’d never seen or tested before.
“We found lines that had never been tested, or, in some cases, lines you couldn’t even reach as a player due to a bug,” he said. “Sometimes they were entire quest branches, small and large, hidden within these files that unless you stumbled into it as a player, or it was documented somewhere, you wouldn’t know that it’s even there, meaning…QA couldn’t [log] the bug if we didn’t know the bug was present.”
Image via Obsidian Entertainment/Paradox Interactive.
Benefield was a tester at the time, and while he was implementing poses, he spotted a potentially game-breaking bug tied to the game’s reputation system. In the game’s opening hours, players balance demands from different factions like “The Disfavored” and “The Scarlet Chorus” to garner reputation, culminating in a scene that cements their initial allegiance. The scene checks the player’s reputation with each faction, which is tracked as a number that goes up or down depending on different choices. Whichever faction the player has a better reputation with determines which scene plays out and how the game progresses.
Because Benefield was working in the narrative tool and could see the number values and narrative node pathways, he did the math and found it was possible for players to make a precise set of choices that would end with equal reputation values for each faction. This was not an intended outcome, and the player would not be able to progress the story.
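The kind of math Benefield describes can be sketched as a brute-force search over every combination of choices. This is only an illustration: the decision table, the faction labels, and the `tie_fallback_scene` branch are hypothetical and are not taken from Tyranny’s data or from Obsidian’s actual fix.

```python
from itertools import product

# Hypothetical data: each decision point lists its options as
# (reputation change with faction A, reputation change with faction B).
DECISIONS = [
    [(2, 0), (0, 2)],
    [(1, 0), (0, 1), (0, 0)],
    [(3, 0), (0, 3)],
]


def find_ties(decisions):
    """Return every combination of option indices that ends in equal reputation."""
    ties = []
    for picks in product(*(range(len(options)) for options in decisions)):
        rep_a = sum(decisions[i][p][0] for i, p in enumerate(picks))
        rep_b = sum(decisions[i][p][1] for i, p in enumerate(picks))
        if rep_a == rep_b:
            ties.append(picks)
    return ties


def pick_scene(rep_a, rep_b):
    """Allegiance check with an explicit branch for the tie case."""
    if rep_a > rep_b:
        return "faction_a_scene"
    if rep_b > rep_a:
        return "faction_b_scene"
    # Hypothetical fallback so the story can continue on a tie.
    return "tie_fallback_scene"


ties = find_ties(DECISIONS)
```

With this made-up table, two of the twelve possible paths end in a tie, which is the sort of result that is easy to see from the data and tedious to reproduce by playing a build.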
He showed the bug to his lead…who congratulated him on his initiative but said they couldn’t file a ticket unless he reproduced it in a build. It took him two hours to test and retest his theory, and the fix took 30 seconds. “I found it personally frustrating to see I’d found such a high-severity bug, but the cost and time to reproduce it would prove to be greater than the time to fix it,” he said.
Fortunately, Obsidian listened to his feedback, and after Tyranny it bumped the status of its QA testers up to QA analysts (increasing their pay as well) and created a process for analysts to review quest and dialogue node trees before they went into the build.
Expanding in-tool testing on Avowed
Benefield took a few years away from Obsidian to work as a producer at Nexon, but was hired back at the company as a QA lead on Avowed, where he began implementing this process more rigorously. Avowed’s conversation nodes weren’t larger or more complex than Tyranny’s, but now the QA team also had to track animation, audio, and other gameplay bugs that came with first-person combat and animated conversations.
Using a sample conversation with a merchant from early in Avowed that checks whether a particular party member is present, Benefield showed three ways a mistake in the narrative tool could lead to a bug in the game. If a designer creates a node where a “bark” (a line that occurs without dialogue UI) transitions into a full dialogue sequence, the conversation breaks. If a designer forgets to identify a speaker when setting up a node, or the speaker was deleted after the file was created, the conversation breaks. And if someone forgets to put a “purple” talk node (that ends the conversation) at the end of their sequence, the conversation breaks. All three of these bugs came up “dozens” of times when making Avowed, and they’re “much easier” to spot inside the tools than when they’re reported from inside the game.
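The three mistakes above lend themselves to an automated check over conversation data. The sketch below is a minimal illustration and assumes a made-up node format: the `Node` fields, the `"bark"`, `"dialogue"`, and `"end"` kinds, and the `lint_conversation` function are all hypothetical and do not reflect Obsidian’s narrative tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    node_id: str
    kind: str                       # hypothetical kinds: "bark", "dialogue", "end"
    speaker: Optional[str] = None
    next_ids: List[str] = field(default_factory=list)


def lint_conversation(nodes, known_speakers):
    """Return (node_id, problem) pairs for the three mistakes described above."""
    by_id = {n.node_id: n for n in nodes}
    problems = []
    for node in nodes:
        if node.kind == "end":
            continue  # end nodes need no speaker and no outgoing links
        # A bark that leads into a full dialogue sequence.
        for target_id in node.next_ids:
            target = by_id.get(target_id)
            if target is not None and node.kind == "bark" and target.kind == "dialogue":
                problems.append((node.node_id, "bark transitions into full dialogue"))
        # A speaker that was never set, or was deleted after the file was made.
        if node.speaker is None or node.speaker not in known_speakers:
            problems.append((node.node_id, "missing or deleted speaker"))
        # A sequence that stops without reaching an end node.
        if not node.next_ids:
            problems.append((node.node_id, "sequence stops without an end node"))
    return problems


# Hypothetical merchant conversation containing one of each mistake.
nodes = [
    Node("greet", "bark", "merchant", ["offer"]),
    Node("offer", "dialogue", None, ["haggle"]),
    Node("haggle", "dialogue", "merchant", []),
]
problems = lint_conversation(nodes, {"merchant"})
```

Each reported problem names the node that caused it, which matches the point Benefield makes later about handing designers a root cause rather than a symptom.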

Image by Bryant Francis.
In the process of catching these bugs and other logic breaks, the testing team began to spot more advanced (Benefield called them “fun”) scripting errors, finding them in the game’s data and flagging them to the narrative designer. This saved time for both teams, since narrative designers now knew the root cause of a bug rather than being told the symptom. “It also saves QA time by finding more bugs-per-minute than in the same time they’d spend testing content in the game.”
This process wasn’t limited to Obsidian’s in-house testers. External testers from service provider QLOC were also given this level of access. Testers from both companies grew so proficient at working with the narrative tool that Benefield began thinking: what if they applied this process to Unreal Blueprints as well?
Obsidian’s Unreal Engine designers embraced working with QA
After enough time with this new process (and a healthy holiday break), Benefield began drafting a pitch for the rest of the studio. “If the analysts know these tools so well, and they know the game so well, the only pieces they’re missing are how triggers, trigger volumes, and blueprints work on the Unreal side,” he argued. “So what if we got them that knowledge too?”
The pitch triggered a bit of “imposter syndrome” in Benefield. When he first joined Obsidian, the company’s org chart kept QA siloed away from the rest of the design team, and there wasn’t a lot of professional overlap between the different ends of the company. There wasn’t a mandated divide like you might find at other game studios (this was the tail end of an era when some companies forbade QA from ever speaking with teams outside their department), but it wasn’t what you’d call a close relationship. Though the departments grew closer over the years, the lingering specter of viewing QA as the grunts at the bottom of the organization was still there.
Fortunately (and Benefield said this is one of his “favorite parts” of working at Obsidian), a number of designers quickly warmed up to the idea and were willing to give it a shot. This led to the creation of the “joint review session,” where testers and designers stepped through a quest while reviewing the flow of data in Unreal.

Image via Obsidian Entertainment/Microsoft.
“As they go through, the designer calls out every trigger and script used on the quest, actually pulling them up on a shared screen for the analyst to see and ask questions,” Benefield said. “We also record these so we can be very brief about our notes and really just focus on what’s on screen.”
“Because QA is only ever seeing how it does play out, this gives [designers] a chance to say ‘wait, that’s what it should have been doing, I didn’t realize that’s a bug.’”
These meetings run for an hour (with multiple meetings scheduled if a quest takes longer). Sometimes bugs were so obvious they could be fixed on the call.
Bringing QA and design together improved morale
According to Benefield, Obsidian’s designers loved these sessions. Bugs were squashed, QA learned more about content they needed to test through conventional means, production continued more efficiently, and maybe most importantly, it was a major boost for morale.
“I didn’t see this one coming, but it was very noticeable to everybody involved and everybody adjacent to these sessions,” Benefield recalled. “Folks were enjoying them. [Designers] felt much better about their quests after they’d been beat on during a joint review session.”
Even after these sessions, designers and QA analysts were more comfortable reaching out to each other with questions and comments.
Obsidian made a number of other improvements to the QA process during the making of Avowed, but they all came back to the core practice of training testers on the tools used by designers.
Though the team still used classic “black box” testing to check for bugs emerging organically in the game, this “white box” method brought joy and collaboration to what can be a grinding field in game development.
As Benefield concluded, “By pairing the QA mindset of ‘how do I get this to break’ with the designer mindset of ‘how do I get this to work,’ we’re allowing these people to work closely with each other, and we get a better product in a shorter amount of time.”
GDC and Game Developer are sibling organizations under Informa.