A Large-scale Survey about the Essential Attributes of Software Engineering Expertise

Paul Luo Li*, Andrew Begel

Microsoft

One Microsoft Way, Redmond

{paul.li,andrew.begel}@microsoft.com

Andrew J. Ko

The Information School* University of Washington, DUB Group

[email protected]

ABSTRACT

Software engineers are fundamental to software engineering. Their importance belies a deficient understanding about the essential attributes of software engineering expertise. We present the results of a large-scale world-wide survey of 1,926 expert Microsoft engineers across 67 countries to learn about the importance of 54 previously identified attributes of great software engineers. To help interpret our quantitative findings, we followed up with brief email interviews with 77 of the survey respondents. We found that the two key drivers of the importance ratings were having the mental capacity to handle complexity and embracing life-long learning. We also found unexpected relationships between the respondents’ ratings and their level of experience, education history, and cultural background, for which we provide possible explanations. Relative to rankings in prior research, we found the essential attributes of software engineering expertise to encompass internal attributes of the engineers’ personality and decision-making abilities, in addition to external attributes of the software they produce and their interactions with teammates. We discuss the implications of our results for researchers, educators, and practitioners.

Categories and Subject Descriptors

D.2.9 [Management]: Productivity, Programming teams

General Terms

Human Factors, Management

Keywords

Software engineers, expertise, productivity, teamwork

1. INTRODUCTION

Software engineers are essential to the engineering of good software. But what are the personality characteristics, behaviors, and knowledge that most enable a software engineer to write and maintain good software? These questions are at the foundation of nearly every part of our world’s rapidly growing software ecosystem: universities want to train great engineers, employers want to hire and retain great engineers, and young engineers want to become great.

What little research we do have about the important attributes of software engineering expertise is incomplete, indirect, and potentially unsound. Many software engineering researchers use productivity as a measure of expertise. In 1968, Sackman et al. [28] conducted one of the first studies comparing engineers, finding that completion times can vary as much as 28:1 between the best and worst engineers. Gugerty and Olson [14] described the differences between novices and experts as completing more tasks in a set amount of time, completing tasks faster, or making fewer mistakes. Since then, productivity has been used as an outcome measure in many areas of software engineering research, e.g. tools [8] and reliability engineering [4]. However, while relevant, productivity is not the sole criterion of expertise; it does not cover how the software is produced or contextual technical aspects of its construction (e.g. architecture and design).

Many studies demonstrate that software engineering is a sociotechnical undertaking. In 1998, Kelley [18] concluded a study, started in 1985, of observing and interviewing over a thousand engineers, including software engineers at HP and Bell Labs. He derived nine effective working strategies of star performers: “blazing trails”, “knowing who knows”, “proactive self-management”, “getting the big picture”, “the right kind of followership”, “teamwork as joint ownership of a project”, “small-I leadership”, “street smarts”, and “show and tell.” While we will not detail these here, all of these attributes concern how the software is produced within a team context.

Research examining real-world software engineering practices reports similar findings. For example, in 1994, Perry et al. [27] used time-diaries of 14 engineers at Bell Labs to find that though coding consumed most of the time, roughly half of the time was occupied by non-coding tasks including support, high-level design analysis, low-level testing, and planning. Research into activities of software engineers, e.g. communications [6][19], conflict resolution [13], and bug triage/assignment [2][15][3], suggests that collaborating with teammates is important to the creation of software.

From a technical skills perspective, the ACM Computing Curricula [1] suggest that simply having running code is not sufficient. The curricula, developed over multiple years by a joint committee of academics and educators with reviews from industry [20], list the top technical skills for software engineers as “Programming Fundamentals,” “Software Design,” “Software Modeling and Analysis,” “Software Verification and Validation,” and “Project Management.” These topics indicate that internal technical attributes of the software, particularly its architecture, are also important aspects of expertise.

While many researchers have focused on external attributes of engagement with others and technical skills, meta-analyses of the research findings about human factors [11] and reports from industry experts suggest that internal attributes of an engineer’s personality and decision-making processes are also part of expertise. McConnell [23] states that effective developers, in addition to having technical skills, also have personality traits including being humble about their intelligence, curious (continuously developing and improving one’s skills), intellectually honest, communicative and collaborative with others, disciplined and systematic in tackling problems, improvement seeking, and effective at developing good engineering process habits. In a New York Times interview [9], Bock states that typically at Google, a software engineer’s college education ceases to be important after several years. Therefore, a software engineer’s ability to learn on the job is considered critical. He further claims that judgment, inspiration, and creativity are all more important than technical knowledge.

The volume of information we have available belies the lack of a rigorous understanding about software engineering expertise. First and foremost, since these different kinds of attributes have not been examined together, we do not know or understand their relative importance. This leads us to ask:

RQ1. What do experienced engineers believe are the essential attributes of the best engineers and why?

In addition to needing to understand the degree to which experienced engineers believe various attributes are important, we also need to understand how these opinions vary by context. Various research studies suggest amount of experience [5], gender [22], education background [10], cultural background [7], type of software [16], and the number of engineers working together [26] may all affect how important an attribute is and is perceived to be. Yet, there is little understanding of their effects (if any at all). Therefore, we also seek to understand:

RQ2. How and why do perceptions about attributes of software engineering expertise vary due to contextual factors?

Finally, a simple ranking of attributes is insufficient; we must also examine differences with prior knowledge in the literature. Rationalizing the differences can help us to better understand the gaps in our knowledge and to identify directions for progress. Therefore, we further seek to understand:

RQ3. How and why do our rankings differ from other rankings in prior work?

Since work prior to ours has examined numerous aspects of software engineering expertise, we wanted to combine and validate all of those aspects into a comprehensive list of attributes with contextual and differentiable definitions [25]. Therefore, in our prior work, we began our research arc with an interview study with 59 experienced Microsoft engineers. We developed a contextualized, mutually-distinct, and comprehensive list of 53 attributes of software engineering expertise [21] (we later added does due diligence beforehand, for a total of 54 attributes). The 54 attributes are shown in Figure 1, and are grouped using the model developed in our interview study. Table 1 (found in Section 4) has a fuller description of the attributes. The attributes roughly fall into four groups: 1) internal attributes of the engineer’s personality, 2) internal attributes of the engineers’ decision-making abilities, 3) external attributes of the impact that engineers have on their teammates, and 4) external attributes of the impact that engineers have on their software product. The decision-making attributes generally concern engineers expanding their knowledge as well as creating and updating their decision-making models (recognizing situations, knowing alternative courses of action, likely outcomes, and values of outcomes). The external attributes focus on applying emotional intelligence and decision-making models to their software, their teammates, and the potentially millions of users and stakeholders they serve via their software engineering efforts.

With the attributes in hand, in this study, we conducted a large-scale worldwide quantitative survey of experienced Microsoft software engineers, along with qualitative follow-up interviews. Overall, we received survey responses from 1,926 engineers covering 67 countries (~13% of all experienced Microsoft software engineers worldwide), and we obtained follow-up email interviews with 77 experienced engineers (out of 111 asked). We specifically targeted very experienced software engineers (in Microsoft parlance, engineers with titles such as Technical Fellow and Distinguished Engineer) who typically have 15+ years of experience and are responsible for critical technical areas around the company. These data enable us to contribute a multifaceted and nuanced understanding of software engineering expertise under real-world conditions.

2. METHOD

To answer our research questions, we proceed in two phases. We first surveyed experienced Microsoft engineers about the importance of the 54 attributes. We followed up with email interviews with selected respondents to help understand and interpret the survey findings.

2.1 Survey

With access to the Microsoft company directory (the first and third authors are employees), we identified expert engineers based on their job titles. We randomly selected full-time employees in software development roles (e.g. “senior software development engineer”), and then scoped to senior and principal promotion levels. Based on best practices in human expertise literature [12], this approach ensured that the respondents had been recognized (via promotion or hiring) as having attained some desired level of software engineering expertise.

As with most companies, there are more engineers with less experience and fewer engineers with extensive experience. To ensure that we obtained information from the most experienced engineers, we created two sampling strata: experienced (titles up to “Senior Software Development Lead”) and very experienced (titles from “Principal Software Development Engineer” and higher). Overall, we obtained 1,926 survey responses: 825 responses from experienced engineers (~7% of all experienced engineers at Microsoft worldwide; of the 1,802 solicited this was a response rate of 46%), and 1,101 responses from very experienced engineers (~35% of all very experienced engineers worldwide; of the 2,496 solicited this was a response rate of 44%).

The anonymous survey was hosted on a Microsoft Research website. We emailed engineers asking them to participate, offering a report of the findings and entry into a gift certificate raffle as incentives. We sent reminder emails after the first week and after one month. The survey was open for ~3 months, from Dec 2014 to Feb 2015.

After explaining the purpose of the study and their right not to participate, the survey asked questions about the respondent’s demographics, experience level, and current work context. Table 2 (in Section 4) lists the contextual factors and their distributions within our sample.

In anticipation of respondent fatigue, we presented questions about the attributes in four groups, corresponding to the four groups in Figure 1—18 on personality, 9 on decision making, 18 on interacting with teammates, and 9 on the software produced. This approach reduced mental fatigue from context switching between different types of attributes. To address ordering bias and to enable analysis of incomplete results, we randomized the ordering of the four groups, as well as randomizing (separately) the ordering of the attributes within each group.
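The two-level randomization described above can be illustrated with a minimal sketch. The attribute names below are hypothetical placeholders; only the group sizes (18, 9, 18, 9) come from the text, and the survey's actual implementation is not described in this paper.

```python
import random

# Hypothetical placeholder names; only the group sizes come from the text.
GROUPS = {
    "personality": [f"personality_{i}" for i in range(1, 19)],
    "decision_making": [f"decision_{i}" for i in range(1, 10)],
    "teammates": [f"teammates_{i}" for i in range(1, 19)],
    "software": [f"software_{i}" for i in range(1, 10)],
}


def question_order(groups, rng):
    """Return one respondent's question order.

    The four groups are shuffled first, then the attributes are shuffled
    separately within each group, so each group stays contiguous.
    """
    group_names = list(groups)
    rng.shuffle(group_names)
    order = []
    for name in group_names:
        attributes = list(groups[name])
        rng.shuffle(attributes)
        order.extend((name, attribute) for attribute in attributes)
    return order


# A seeded generator keeps this illustration reproducible.
order = question_order(GROUPS, random.Random(2014))
```

Keeping each group contiguous limits context switching, while shuffling at both levels gives every attribute a chance to appear early in the survey.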

We framed each attribute description around an engineer with that attribute. For example, for the hardworking attribute (defined in Table 1), the description was: “A hardworking developer is willing to work more than 8 hr days to deliver the software product.”

To ensure that respondents accurately understood the attributes, we first piloted the survey with five engineers using the think-aloud protocol. This led to multiple changes to match the thinking and understanding of Microsoft engineers. For example, the term ‘software engineer’ was changed to ‘developer’ in order to differentiate people on engineering teams that did not write code (e.g. testers and program managers). Supporting quotes were added (where appropriate) for confusing attributes. Finally, the attribute does due diligence beforehand was added. The asks for help attribute originally encompassed both seeking out others for needed information and learning as much as one can in order to ask intelligent questions; feedback was that many developers did the former without doing the latter, and therefore the attribute should be split into two attributes.

To get a holistic and absolute rating of importance, we asked: “If an experienced developer—whose primary responsibility is developing software—did not have this attribute, could you still consider them a great developer?” Respondents were given six choices: “Cannot be a great developer if they do not have this”, “Very difficult to be a great developer without this, but not impossible”, “Can be a great developer without this, but having it helps”, “Does not matter if they do not have this, it is irrelevant”, “A great developer should not have this; it is not good”, and “I do not know”. Based on piloting the survey, we found that this negatively phrased question was easier for respondents to answer and was better at eliciting assessment of the attribute’s holistic importance. Figure 2 shows a screenshot of the survey page for the hardworking attribute.

With the pilot feedback, the final survey structured questions like the one in Figure 2, which is the question for the hardworking attribute in the deployed survey. Each question about each attribute was structured and phrased in a similar manner, allowing respondents to quickly read and respond.

We deployed the survey in two waves, going out to an initial set of 200 developers (~100 in each experience stratum) to look for problems. We monitored the number of ‘I do not know’ responses, time spent on the questions, where respondents were dropping out, and responses to the concluding open-ended question. After finding no issues, we proceeded to deploy the survey to the larger sample.

The survey took respondents a median of 29 minutes to complete, with some outliers due to respondents leaving and returning to the survey at a later time. Overall, 1,634 respondents (84.3%) completed the survey and 292 provided ratings for at least one attribute. We used both complete and partial data in our analysis because each attribute had an equal chance of being seen (due to the randomization) and the assessment of importance of each attribute was independent of other attributes (due to how we asked the question about importance).

2.2 Analysis

To determine the most and least important attributes, we could not compare the average ratings of each attribute because the ratings were ordinal and our response levels were not centered (because they included four positive ratings and only one negative rating). Instead, we ranked the attributes by comparing the ratings distribution of each attribute to the ratings distribution of every other attribute, counting the number of distributions for which an attribute’s distribution was significantly higher. We used the Mann-Whitney rank-order test to measure significance, which can be used to compare ordinal data and can be used when the number of observations is not equal [17]. For each attribute, we therefore performed 53 one-sided Mann-Whitney tests, one test against every other attribute. We then calculated the number of statistically significant pairwise comparisons with other attributes at α=.05. Finally, we ranked the attributes based on the number of statistically significant tests. For example, the ratings distribution of the most important attribute in our set was statistically higher than all 53 other attributes based on the one-sided Mann-Whitney rank-test. Table 1 in Section 4 shows the ratings distributions for all 54 attributes.
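As an illustration of this ranking procedure, the following is a minimal sketch rather than the analysis code used in the study. It assumes ratings are coded as integers where a larger value means greater importance and that ‘I do not know’ responses have been dropped; the function names and toy data are hypothetical. The one-sided p-value uses the large-sample normal approximation with a tie correction and no continuity correction.

```python
import math
from collections import Counter


def mann_whitney_greater(x, y):
    """One-sided p-value for the hypothesis that ratings in x tend to exceed those in y."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    counts = Counter(x) + Counter(y)
    # Give every distinct rating value its midrank in the pooled sample.
    midrank, seen = {}, 0
    for value in sorted(counts):
        c = counts[value]
        midrank[value] = seen + (c + 1) / 2
        seen += c
    u1 = sum(midrank[v] for v in x) - n1 * (n1 + 1) / 2
    tie_term = sum(c**3 - c for c in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return 1.0  # every rating is identical, so there is no evidence of a difference
    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability


def rank_attributes(ratings, alpha=0.05):
    """Order attributes by the number of other attributes they significantly exceed."""
    wins = {
        a: sum(
            1
            for b in ratings
            if b != a and mann_whitney_greater(ratings[a], ratings[b]) < alpha
        )
        for a in ratings
    }
    return sorted(wins.items(), key=lambda item: item[1], reverse=True)


# Made-up ratings on an assumed 1-5 coding (5 = most important).
toy = {
    "attribute_a": [5] * 60 + [4] * 30 + [3] * 10,
    "attribute_b": [5] * 30 + [4] * 40 + [3] * 30,
    "attribute_c": [4] * 20 + [3] * 50 + [2] * 30,
}
ranking = rank_attributes(toy)
```

With 54 attributes, the same loop would perform 53 one-sided tests per attribute, and the win count corresponds to the fourth column of Table 1.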

To analyze the relationship between contextual factors and the attribute ratings, we used Ordinal Logistic Regression. We assessed the first order relationships between the contextual factors and the ratings for each attribute. To account for performing multiple tests, we used the Benjamini & Hochberg False Discovery Rate (FDR) adjustment at the q=.1 level. Because the question was optional, only 1,512 respondents provided information on Age. To maximize statistical power, we first fitted models with all factors to assess the effects of Age, and then fitted separate models without Age, to assess the effects of other factors.
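The Benjamini & Hochberg step-up procedure can be sketched as follows. This is an illustrative implementation with made-up p-values, not the code used in the study; the function name is hypothetical.

```python
def benjamini_hochberg(p_values, q=0.1):
    """Return one boolean per test: True where the null is rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k whose p-value satisfies p_(k) <= (k / m) * q.
    cutoff = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / m * q:
            cutoff = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for index in order[:cutoff]:
        rejected[index] = True
    return rejected


# Made-up p-values, one per (contextual factor, attribute) test.
p_values = [0.001, 0.012, 0.04, 0.03, 0.20, 0.70]
significant = benjamini_hochberg(p_values, q=0.1)
```

Because the procedure is step-up, a p-value that misses its own threshold is still rejected when a larger p-value meets a later threshold.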

2.3 Follow-up Email Interview

To help interpret the rankings and the relationships from the analysis, we emailed respondents to ask for further insights into their survey responses. We asked about the highest ranked attributes (the five bolded attributes atop the ranked list in Table 1 in the results section), the potentially detrimental attributes (the two attributes with the highest percentage of ‘A great developer should not have this; it is not good’ ratings, in bold at the bottom of Table 1), and the attributes which were significantly affected by context (the relationships listed in Table 2 of the results section). For the highest ranked attributes and positive relationships, we picked respondents that rated the attribute the highest and had the largest positive differences with their median ratings, aiming to avoid respondents that rated all attributes highly. For the detrimental attributes and negative relationships, we picked respondents that rated the attribute the lowest and had the largest negative difference with their median ratings.
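One way to read this selection rule is sketched below. The text does not say how the two criteria were combined, so the ordering used here (rating first, then difference from the respondent's own median) is an assumption, and the function name and data are hypothetical.

```python
from statistics import median


def pick_informants(responses, attribute, k=1, positive=True):
    """Pick up to k respondents whose rating of `attribute` stands out.

    `responses` maps a respondent id to a dict of {attribute: numeric rating}.
    With positive=True the highest raters come first, and ties are broken by the
    largest positive difference from the respondent's own median rating.
    With positive=False the order is reversed (lowest rating, most negative difference).
    """
    scored = []
    for respondent, ratings in responses.items():
        if attribute not in ratings:
            continue
        rating = ratings[attribute]
        difference = rating - median(ratings.values())
        scored.append((respondent, rating, difference))
    sign = -1 if positive else 1
    scored.sort(key=lambda row: (sign * row[1], sign * row[2]))
    return [respondent for respondent, _, _ in scored[:k]]


# Made-up responses on an assumed 1-5 coding (5 = most important).
responses = {
    "r1": {"honest": 5, "hardworking": 5, "executes": 5},  # rates everything highly
    "r2": {"honest": 5, "hardworking": 3, "executes": 3},  # honest stands out
    "r3": {"honest": 4, "hardworking": 2, "executes": 3},
}
top_for_honest = pick_informants(responses, "honest")
low_for_hardworking = pick_informants(responses, "hardworking", positive=False)
```

In this toy data, the tie-break on the difference from the median prefers r2 over r1, who gave every attribute the top rating.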

In our survey, 771 respondents indicated that they would be willing to discuss their responses further. We sent follow-up emails to 111 of these engineers, receiving replies from 77 informants (69.4% response rate). We tried, where reasonable, to include questions about multiple attributes into a single email to uncover insights that spanned multiple relationships. We systematically analyzed the responses to gain understandings, selected representative quotes, and then asked the informants’ permission to share anonymized quotes.

3. FINDINGS

Due to limited space for discussions, we focus on the top 5 (the highest ranked) and bottom 2 (potentially detrimental) attributes, and the three relationships that we found most interesting. For the quotes, “SDE” in the title that follows the quote refers to Software Development Engineer.

3.1 Highest and Lowest Ranked Attributes

The ordered list of attributes in Table 1 shows the most important attributes at the top and the least important ones at the bottom. Rating distributions are shown in the third column, and the number of other attributes whose ratings distributions were significantly lower is shown in the fourth column.

The most important attribute was pays attention to coding details (ranked 1, higher ratings distribution than 53 attributes, 63.1% highest rating). Respondents explained that first and foremost, engineers judged other engineers by their code. Therefore, engineers that could not get the basics correct were not respected:

“Another strong driver is the respect of our peers, which you won’t get by writing shoddy code…” -Principal SDE

Second, software produced by engineers could be used in many ways, often unforeseen by the engineer, and therefore, engineers needed to pay attention to the details to avoid costly problems:

“This code is performance critical, compatibility sensitive, and is used in a huge variety of contexts. If a developer fails to handle an error, some customer will hit it, and we will likely need to issue a hotfix; if a developer implements an inefficient algorithm (N^2 is not ok) we will need to hotfix it; if a developer consumes memory excessively in some environment we will need to hotfix it, etc.” -Principal SDE

This may have been especially important at Microsoft, where software products are often platforms, components, and/or used in contexts unforeseen by the engineer. This thinking also underlies mentally capable of handling complexity (ranked 2, higher ratings distribution than 52 attributes, 54.2% highest ratings) being an important attribute. Informants felt that great engineers need to be able to think through complex situations to write good code.

“Most useful software has to be highly tolerant of incorrect usage by the user/caller above it, and interacting with the supporting code below it… Developers who cannot handle complexity tend to always be fixing bugs or having to do “another” release to take into account situations they had not thought of… Why didn’t they think of those situations? Because they did not have the mindset of complexity.” -Principal SDE

Informants indicated that continuously improving (ranked 3, higher ratings distribution than 49 attributes, 51.0% highest ratings) and open-minded (ranked 5, higher ratings distribution than 49 attributes, 49.4% highest ratings) were important attributes because the software industry moves quickly, and so great engineers need to be open to new ideas and to keep learning:

“As the technology/technique evolves and better tools come along, the open-minded developer picks up on these and is willing to apply them to be more productive/effective… without an effort to continuously improve…developers will soon find themselves lagging behind the industry and/or state-of-the-art with technology and technique.” -Principal SDE Lead

This thinking also contributed to honest (ranked 4, higher distribution than 49 attributes, 50.8% highest ratings) being important. Informants indicated that great engineers needed to acknowledge mistakes in order to make the right decisions for themselves and their teams:

“Lying to yourself is much easier in my profession than in any other profession I know…It’s so easy to think that you know the topic and miss (subconsciously ignore) evidence that contradicts your “knowledge”. Great developer never makes that mistake. He simultaneously knows a lot and questions everything he knows.” -Principal SDE

Regarding the honest attribute, informants also discussed developers being dishonest about the situation to the detriment of others and felt strongly that such behaviors were bad:

“This has happened to me any number of times in my 20 years here: I needed a particular component, and a team which had such a component would “lie” to me about its availability and maturity in order to get me to be a user and justify their own existence to management…To serve themselves, they worked “against” me.” -Principal SDE

Two attributes received ratings of “A great developer should not have this; it is not good” from more than 5% of the respondents: trading favors (ranked 54, higher distribution than 0 attributes, 6.0% lowest rating) and hardworking (ranked 53, higher distribution than 1 attribute, 5.0% lowest rating). We followed up on these attributes because none of the attributes were expected to be detrimental; they were based on our prior interview study, which focused exclusively on attributes of great engineers [21]. In follow-ups, informants indicated that these attributes were not inherently bad, but likely reflected bad situations. For hardworking, informants believed that needing to work more than 9-to-5 hours may be indicative of poor planning or unsustainable engineering practices:

“workload for a developer is a function of management and planning happening above that developer. Usually long working hours are needed, because the planning was not good, the decisions made during the project lifecycle were bad, the change management wasn’t ‘agile’ enough” -SDE2

This may have been more salient for companies outside of the US and in the games industry, accounting for the negative relationship between hardworking and having work experiences in the UK and Other countries (row 16 in Table 2):

“I’ve definitely seen this first hand, as people steadily become less productive over time and tend to make more short-term decisions… Having previously worked in both games and visual effects, where the “death march” is not uncommon” -Senior SDE

For the trades favors attribute, informants believed that needing to do personal favors might reflect a biased decision-making culture:

“They should be totally separated, else what I have seen is we tend to make biased decisions and opinions about others.” -SDE2

Furthermore, needing undocumented processes to get things done might indicate poorly organized practices, making it harder for engineers to operate effectively:

“Once you “trade favors” you are getting into personal give and take and builds institutional memory around a couple of nodes in a people graph and possibly not visible outside of that relationship so could easily get lost when one or both leave the project/organization.” -Principal SDE

3.2 Influence of Contextual Factors

Table 2 summarizes the statistically significant relationships between contextual factors and attribute ratings based on Ordered Logistic Regression (OLR) with False Discovery Rate (FDR) correction at the q=0.1 level. The first column states the factor; statistically significant relationships are in column 4, with (+) indicating a positive relationship (the presence of the factor or higher values of the factor is related to higher ratings for the attribute) and (–) indicating a negative relationship (presence of the factor or higher values of the factor is related to lower ratings for the attribute).

Table 3 provides details on attributes that had statistically significant relationships with three specific types of contextual factors we found interesting: level of experience, educational background, and prior work experience in China. The factors are sorted based on their odds-ratio effect, in column seven. The higher the odds ratio, the greater impact the contextual factor had on respondents’ rating of the attribute (odds ratios less than 1 had a negative impact on ratings). For categorical factors, we computed the odds ratio based on the odds of a higher rating for the attribute (all else held equal) if the factor was present. For numerical factors, the odds ratio was based on the ratio between the 75th and 25th percentile. For example, if a respondent had work experience in China, then the odds of a high rating for the trades favors attribute (at any rating level) are 2.309 times greater, all other factors held constant.
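The two kinds of odds ratios can be illustrated with a short sketch. It assumes the usual proportional-odds reading, in which a fitted coefficient beta is a log-odds change per unit of the factor; the paper does not give the formula, so the interquartile computation below is our interpretation, and the function names and data are hypothetical.

```python
import math
from statistics import quantiles


def odds_ratio_categorical(beta):
    """Odds ratio for a 0/1 factor: odds of a higher rating when the factor is present."""
    return math.exp(beta)


def odds_ratio_interquartile(beta, values):
    """Odds ratio comparing the 75th percentile of a numeric factor with its 25th percentile."""
    q1, _, q3 = quantiles(values, n=4)
    return math.exp(beta * (q3 - q1))


# The reported odds ratio of 2.309 corresponds to a coefficient of ln(2.309).
beta_china = math.log(2.309)
ratio_china = odds_ratio_categorical(beta_china)

# Made-up numeric factor (e.g. years of experience) and made-up coefficient.
years = list(range(1, 12))
ratio_years = odds_ratio_interquartile(0.1, years)
```

Scaling a numeric factor by its interquartile range makes its odds ratio comparable to that of a binary factor, since both then describe a move between typical low and high values.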

Experience Level. We followed up on the first 4 factors (Is very experienced, Age, Years as a professional developer, Years at Microsoft, and Employment at software companies) together as level of experience, and we discuss them collectively here. This was reasonable because the factors all aim to measure the same underlying construct of experience and are highly correlated with each other. The statistically significant relationships between level of experience and the eight attributes (first 4 rows in Table 2, corresponding to rows 1, 3, 12-19 in Table 3) were all positive (i.e. a higher experience level corresponded to higher ratings for the attributes).

Informants suggested four underlying reasons for the observed positive relationships with level of experience. First (and most obviously), informants felt that engineers with higher experience level placed more importance on contributing to the business because higher-level engineers were evaluated based on the contributions they make towards the goals of the organization; this encompassed the relationships with the attributes: aligned with organizational goals and knowledgeable about the customer and business:

“Our evaluation system(s) have always emphasized developers that deliver on the organizational goals of the company… more experienced developers are likely to understand, that alignment with the company goals delivers greater rewards.” -Principal SDE Manager

Second, informants felt that engineers with higher level of experience valued delivering results, encompassing the relationships with the attributes: hardworking, desires to turn ideas into reality, and executes. To make meaningful contributions, engineers needed to deliver software:

“20 years of experience managing engineers in startups and big companies alike…No matter how talented, sharp minded and skillful one is, if they are not hardworking (i.e. willing to work long hours to meet deadlines/deliverables) they will not succeed…” -Partner SDE Lead

Third, informants felt that engineers with higher level of experience placed more importance on gaining knowledge and making smarter decisions because they have gone through multiple releases and have experienced the pain of mistakes; this encompassed the relationships with the attributes: knowledgeable about tools and building materials, knowledgeable about software engineering processes, and makes informed trade-offs:

“Newer devs (myself included when I began) trying to either prove themselves or chase after cool add-on things… and it just adds complexity…Understanding costs beyond the initial release is something incredibly important, and something hard for even seasoned devs to remember sometimes.” -Senior SDE

“ ‘Knowledgeable’ and ‘Informed’ only come from experience. This is all about breadth and exposure to lots of situations that let you generalize to new ones… you learn to be less confident that you immediately know the best answer to a problem. You actually become more flexible and are willing to trade off among goals you might not even have considered earlier in your career… It takes a while for most people to really appreciate the big picture and to be able to make decisions based on a broader context than the one they naturally work in.” -Architect

“Software engineering processes are there for a reason… The more experience you are, the more you saw the pros and cons of process first hand.” -Principal SDE Lead

Finally, as a natural corollary to the previous finding about gaining knowledge and making smarter decisions, informants felt that engineers with higher level of experience better understood that they needed to be continuously improving to stay ahead. Experienced engineers recognized that if they did not continue to learn, then they might become ‘obsolete’.

“Nobody can stay at the top without “improving” because the next wave of technology will soon obsolete whatever was at the top.” -Partner SDE

Educational background. Having a Masters and/or PhD degree had unexpected negative relationships with attributes related to giving and receiving help (for Masters, see row 10 in Table 2, row 20 in Table 3; for Ph.D., see row 11 in Table 2, last two rows, 21-22, in Table 3). Informants provided two interesting possible explanations. First, since a graduate degree is largely optional for success in the software industry, informants felt that engineers that get those degrees may be more intrinsically motivated than others. This may lead them to be less inclined to rely on others for help and motivation:

“They weren’t satisfied with the bare minimum of a bachelor’s degree… getting a master’s degree doesn’t really impact your paycheck very much in this industry… I think these people who seek knowledge… they want to find things out for themselves” -Principal SDE

Second, informants suggested that engineers with graduate degrees were often hired as technical experts, such that they rarely had the opportunity to give or receive help:

“problems which either nobody has tried to solve before, everyone else has failed solving before, or handling some major sort of crisis… they operate under the assumption that there’s nobody to ask help from when there’s a crisis and they will need to be able to figure out the solutions themselves.” -Principal SDE

We examined this hypothesis by comparing the number of developers worked with in the past year (row 20 in Table 2) between engineers with and without advanced degrees. Using the Mann-Whitney rank test, we found that the number of engineers worked with in the past year was statistically significantly lower (α=.05) both for engineers with a Masters degree (p-value=0.004, medians 12 and 15) and for engineers with a Ph.D. degree (p-value=0.037, medians 11 and 15). These results support the hypothesis that engineers with advanced degrees work with fewer engineers.
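For readers who wish to run a comparison of this kind on their own data, the following is a minimal sketch of a two-sided Mann-Whitney test in Python. It is not the analysis code used in this study: the function names, the normal approximation with tie correction (and no continuity correction), and the two small samples are assumptions chosen for illustration, and the sample values are hypothetical rather than survey data.

```python
import math
from statistics import NormalDist, median


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney test via the normal approximation.

    Returns (U, p_value), where U is the smaller of the two U statistics.
    """
    n1, n2 = len(x), len(y)
    combined = list(x) + list(y)
    ranks = average_ranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)

    # Variance of U under the null hypothesis, corrected for ties.
    n = n1 + n2
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u, 1.0

    z = (u - n1 * n2 / 2) / math.sqrt(variance)  # z <= 0 because u is the minimum
    return u, min(1.0, 2 * NormalDist().cdf(z))


# Hypothetical counts of engineers worked with in the past year (NOT survey data).
with_degree = [8, 10, 11, 12, 12, 13, 14, 9, 10, 12]
without_degree = [15, 14, 16, 18, 15, 13, 20, 17, 15, 16]

u_stat, p_value = mann_whitney_u(with_degree, without_degree)
print("medians:", median(with_degree), median(without_degree))
print("U =", u_stat, "p =", p_value)
print("significant at alpha=.05:", p_value < 0.05)
```

A rank-based test is a natural fit here because counts of collaborators are skewed and medians, not means, are being compared. For small samples, an exact test would be preferable to the normal approximation used in this sketch.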

Work experience in another country. The contextual factor having work experience in China (summarized in row 16 of Table 2, corresponding to rows 3 and 4-11 in Table 4) reveals interesting cultural influences. For example, for the trading favors attribute, informants indicated that its ratings were influenced by the broader business context in China:

“Culturally there is a different perception… ‘guanxi’ [关系] it’s just a part of how business is done. Well of course, the best, the most successful are the ones that have those relationships. That would be a positive thing. There would, culturally, be a relationship in people’s mind that it would go hand in hand with any career or profession… even in an engineering context.” -Principal SDE Manager

Furthermore, informants indicated that commonly accepted local practices and expectations also affected perceptions of the attributes’ importance:

“Systematic, I wouldn’t be surprise if that’s skewed… Part of it is culture. There’s just a daily grind of getting things done. People there would acknowledge that it doesn’t make sense; it’s just the way it works, why would you change it.” -Principal SDE Manager

4. DISCUSSION

In this section, we discuss underlying reasons for the rankings and relationships, as well as compare our rankings to rankings in prior work. We discuss the implications of the findings for other researchers, educators, and practitioners. We finish with a discussion of the threats to validity.

4.1 The Essential Attributes

In our first research question, we asked: what do experienced engineers believe are the essential attributes of the best engineers and why. Our analyses indicate that software engineering (at least at Microsoft) changes rapidly. Even foundational concepts can change over time, such that engineers who do not grow and evolve risk becoming obsolete. Consequently, it is not a specific set of knowledge or processes but rather the desire, ability, and capacity to learn that defines the best engineers. The theme of constant learning was prevalent throughout the survey and follow-up data; informants frequently indicated that greatness was attained and maintained over time. This contributed to multiple associated attributes (honest, open-minded, and continuously improving) being atop the rankings, as well as to high rankings for numerous personality attributes related to learning and improving (see Section 4.1 and Table 1).

This insight has implications for both researchers and practitioners. While there is evidence that the best engineers at Microsoft need to embrace life-long learning, it is unclear whether software engineers at all organizations face similar pressures. We encourage others to replicate this study at organizations with different cultures (e.g. Facebook or Amazon) and with different incentives (e.g. finance companies where engineers develop software in support of the core business). This would help to better understand whether Microsoft is a special case or whether this pressure pervades the software engineering profession.

Results also indicate that engineering of software at its highest levels at Microsoft is a complicated and complex technical undertaking. Teams need experienced engineers who are smart, technically savvy, and dedicated to finding and implementing solutions. Informants indicated that uncertainty and complexity surround their software from underlying dependencies, system states, external callers, and/or partner components. The top rankings of pays attention to coding details and mentally capable of handling complexity reflect this sentiment, as do the high rankings for other product and decision-making attributes (see Section 4.1 and Table 1). Informants felt that at the end of the day, an engineer needs to be able to make sense of the technical, business, and domain contexts to make meaningful progress. This insight raises several opportunities for invention. If engineers do not need to hold all of the information in their heads [24], but rather need to be able to retrieve, interpret, and synthesize it, tools might help engineers more quickly and accurately understand complexities and uncertainties in the decisions they make.

4.2 Context Effects on Attribute Rankings

In our second research question, we asked: how and why do perceptions about attributes of software engineering expertise vary due to contextual factors. Looking at the influences of level of experience (see Section 4.2, Tables 2 and 3), we see that great engineers need to have real-world first-hand experiences. Many attributes sound good in theory or in isolation, but become unimportant when put into real-world contexts, amid competing concerns and hard deadlines. This sentiment underlies informants’ discussions of the relationships with aligned with organizational goals and knowledgeable about the customer and business (corresponding to identifying key objectives), as well as hardworking, desires to turn ideas into reality, and executes (corresponding to actually delivering the software product). Conversely, seemingly interesting side-projects or innocuous deviations from instituted practices can result in bad outcomes or incur significant engineering debt. It is not until an engineer experiences some real-world consequences of their choices that the costs and benefits of their decisions (and the attributes that they entail) become clear. This sentiment underlies the reasoning for the relationships with knowledgeable about tools and building materials, knowledgeable about software engineering processes, and makes informed trade-offs.

This insight has implications for software engineering education and training. If software engineers must have first-hand experience, how can educators provide that experience? Or is it not the responsibility of academic training to provide this experience? Other professional fields, such as medicine, nursing, and dentistry, have mandatory apprenticeships, where aspiring candidates gain real-world hands-on experience as part of their academic experiences. To what degree would this pedagogical approach be feasible for software engineering? And what would be the advantages (and disadvantages) for young engineers, their employers, and the consumers of their software? To some extent, many students approximate this experience with multiple summer internships, as in the University of Waterloo’s highly regarded co-op program. Future work should investigate the extent to which these programs produce better engineers.

Results from analyzing the negative relationship between having an advanced degree (Masters and Ph.D.) and attributes associated with giving and getting help (see Section 4.2, Tables 2 and 3) indicate that the relationships are likely not due to graduate school education (i.e. graduate schools do not teach engineers to devalue giving and getting help). Rather, they are likely due to the conditions that would lead one to pursue a graduate degree (a selection bias) and the conditions one enters after graduating (a survivorship bias). Informants indicated that a graduate degree is generally not seen as advantageous to an engineer’s career (probably in comparison to hands-on experience, per the previous discussion); therefore, only those of an independent mindset choose to pursue higher education. Furthermore, since those with graduate degrees may be more likely to be placed in more innovation-oriented technical positions (corroborated by the statistically significant differences in the number of engineers worked with in the past year), they may simply have less opportunity to get or to provide help, and therefore view it as less important to good engineering.

Examining the differences in ratings from those with work experience in China (see Section 4.2, Tables 2 and 3) reveals that many facets of Chinese culture affect perceptions of software engineering expertise. The case of trading favors is especially interesting. It was the lowest rated attribute overall and, in many cases, was deemed detrimental. However, it received significantly higher ratings among engineers with work experience in China; follow-ups indicate that this difference may be inextricably tied to ‘guanxi’ (关系): the building of a network of mutually beneficial relationships, commonly found in modern Chinese business culture.

This insight may have implications for researchers and practitioners. Cultural differences impact engineering outcomes, and the conditions in which software development organizations should (or should not) adapt to local cultural norms (versus instituting organizational standards) are important considerations for practitioners with distributed engineering efforts. Future work should investigate additional cultural variations and their impact on effective software engineering.

4.3 Comparison with Prior Rankings

In our third and final research question, we asked: how do our rankings differ from other rankings in the literature and why? Our data indicate that important attributes of great software engineers encompass attributes of the software they produce (pays attention to coding details) and of their engagements with others (honest) as well as internal attributes of their personality (continuously improving and open-minded) and of their decision-making abilities (mentally capable of handling complexity). Furthermore, beyond the top 5, many of the top attributes are internal attributes of the engineer’s personality (e.g. executes, self-reliant, self-reflecting, persevering among the top 10).

These findings are both supportive of prior work and in conflict with it, while also adding nuance. For example, most of the attributes that Kelley [18] mentioned concerned teamwork as a critical determinant of success, but our data showed that teamwork was viewed as less important than a focus on learning and coding skills. When comparing our results to Bock’s New York Times interview [9], we confirmed that learning on the job is viewed as critical, but found that engineers actually view technical skills as just as important. It may be that Bock’s belief about the importance of creativity, inspiration, and judgment emerges from experienced engineers having a breadth and depth of technical knowledge.

These insights have several implications for researchers and educators. In order to make progress towards better understanding and teaching software engineering expertise, we need help from researchers and educators to devise ways of measuring and assessing these internal attributes [11]. Research into better measurements would be foundational to advancements. Furthermore, educators may need to investigate approaches for teaching personality attributes that may not be reasonable to teach in an academic setting, e.g. self-reliant and persevering.

Differences in the people examined for our study may have also contributed to variations in the rankings. Rankings in prior work include opinions of academics and educators [1], researchers [19][18], and human resource professionals [16]. Our rankings are based solely on insights from engineers. In particular, we believe that the low ranking of attributes of interacting with teammates may be because we only examined engineers. Future work will need to investigate why engineers did not believe teamwork skills were as critical as other attributes.

While we believe this difference contributes to more valid rankings, it raises interesting questions that may be worth further investigations from researchers. Since people in other roles, such as managers, testers, marketers, sales, designers, data scientists, etc., also make important contributions to the software engineering process and are directly impacted by it, future work should examine perceptions about software engineering expertise from these other perspectives from within organizations.

4.4 Threats to Validity

As with any empirical study, there are many threats to validity.

Construct validity issues may arise because the engineers may have interpreted the attributes differently from one another, especially those working in different countries and those for whom English is not their first language. We piloted the survey with five engineers, two of whom were non-native English speakers, using the think-out-loud protocol to verify accurate understanding of the descriptions. Furthermore, we added quotes to clarify the understanding of attributes. Finally, since the attribute descriptions were derived from interviews with Microsoft engineers, respondents likely shared common language and context.

The ambiguous nature of the term “software engineer” is another potential construct validity issue. In our pilots, we provided the ACM definition of a software engineer: “someone who develops software to be used by others” and asked the respondents how best to describe someone who did that task at Microsoft. Based on the feedback, we switched the wording from ‘software engineer’ to ‘developer’ in the survey.

There may also be construct validity issues in the measurements of the contextual factors. The concept of “engineers working together to construct the software” may be interpreted differently by engineers, especially managers that oversee but do not work directly with large engineering organizations. In addition, these experienced engineers may also have relationships (manager to subordinate) that are different in nature from those of other engineers (collaborators). The extensive experience of the respondents (the median respondent had worked for three different companies) may also have led to issues with accurately assessing work context. We asked about respondents’ current work context, but not prior work contexts, due to the necessity of keeping the survey as concise as possible. Their prior work context could have affected their ratings. The lack or the existence of statistically significant relationships may have been affected by our ability to capture the contextual factors accurately.

Internal validity issues may arise from several sources. First, the False Discovery Rate adjustment penalizes the number of statistically significant relationships by the number of relationships found. The numerous significant relationships with having work experience in India and China may have hidden other interesting relationships. Second, our analysis examined first-order relationships between the ratings and contextual variables. Second-order relationships may exist. We felt that this was appropriate given our current understanding (i.e. there was little prior research to support investigating any second-order relationships) and the amount of data available. Finally, we followed up with a small number of respondents, who are themselves a self-selected subset of the respondents; other interpretations of the attributes and relationships may exist.
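To make the effect of a False Discovery Rate adjustment concrete, the sketch below implements the Benjamini-Hochberg step-up procedure in Python. This section does not state which FDR procedure was applied, so the choice of Benjamini-Hochberg is an assumption, as are the function name and the level q=0.05; the p-values are hypothetical and are not results from this study.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where a hypothesis is rejected at FDR level q.

    Benjamini-Hochberg step-up rule: sort the m p-values, find the largest
    rank k (1-based) with p_(k) <= (k / m) * q, and reject ranks 1..k.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff = rank  # keep the largest rank that satisfies the bound

    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for ten attribute/context relationships (NOT study data).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]

unadjusted = [p < 0.05 for p in p_values]
adjusted = benjamini_hochberg(p_values, q=0.05)
print("significant without adjustment:", sum(unadjusted))
print("significant after adjustment:  ", sum(adjusted))
```

In this hypothetical example, several relationships that would pass an unadjusted α=.05 threshold no longer do so after adjustment, because the bar for each p-value depends on its rank among all the relationships tested together.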

External validity issues may also exist. First, we only sampled Microsoft engineers, which limits the external validity of our findings. However, since the attribute definitions were based on interviews with Microsoft engineers, by surveying a similar population, we enhanced the construct validity of our findings. Furthermore, since Microsoft hires top engineers from around the world, creates many innovative products (e.g. Windows, Office, Surface, HoloLens, etc.) and produces software used by billions of users around the world, its engineers represent an interesting, important, and relevant population. Second, we explicitly over-sampled very experienced engineers, which may introduce some bias. These very experienced engineers have thrived in the Microsoft setting, and may exhibit thinking and perspectives that are particularly well-suited to Microsoft. This may decrease the importance of some attributes that Microsoft engineers “take for granted” (e.g. hardworking) because engineers without those attributes would not be hired or would not survive to become experienced engineers. There may also be bias from experienced engineers leaving Microsoft. Some engineers (especially those that have been with the company since its early days) may have had the financial means to retire early, others may have left to pursue other interests (e.g. graduate degrees), and some may have moved to roles in other companies (e.g. join start-ups or establish their own companies). Nonetheless, we feel that having very experienced engineers (many with titles indicating that they were experts and leaders in their domains) was important for having credible and insightful findings.

Finally, though we did not specifically sample for other attributes, we note that we received many responses from both female respondents (149 responses) and non-US respondents (351 responses); this should have allowed us to detect significant systematic differences due to related contextual factors, even with the FDR adjustment.

5. CONCLUSION

In this study, we have contributed a ranked list of attributes of software engineering expertise, as rated by experienced software engineers. In addition, we have examined relationships with a wide array of contextual factors, finding many variations in the perceived importance of expertise attributes due to experience, education, and culture. Furthermore, we provided rationalizations and explanations for the ratings and relationships, based on real-world experiences of engineers, grounding these differences in the perspectives of highly experienced, leading engineers.

With studies like ours and the future work that ours may inspire, we hope to better understand, support, and nurture a world in which the production of software is not viewed as a purely technical pursuit, but one deeply dependent on the cultivation of highly skilled, intelligent lifelong learners.

6. ACKNOWLEDGMENTS

Thanks to our survey participants. This work was supported in part by Microsoft, Google, and the National Science Foundation (NSF) under Grants CCF-0952733, CNS-1240786, and IIS-1314399. Any opinions, findings, conclusions or recommendations are those of the authors and do not necessarily reflect the views of NSF.


7. REFERENCES

1. ACM/IEEE-CS Joint Task Force on Computing Curricula (2015). Computer science curricula 2013. http://dx.doi.org/10.1145/2534860, retrieved on March 13th, 2015.

2. Anvik, J. and Murphy, G.C. (2007). Determining implementation expertise from bug reports. Proceedings of the Fourth International Workshop on Mining Software Repositories, 298–308.

3. Aranda, J. and Venolia, G. (2009). The secret life of bugs: going past the errors and omissions in software repositories. Proceedings of the IEEE 31st International Conference on Software Engineering, 298–308.

4. Barnard, J. and Price, A. (1994). Managing Code Inspection Information. IEEE Software 11, 2, 59–69.

5. Begel, A. and Simon, B. (2008). Novice software developers, all over again. Proceedings of the Fourth International Computing Education Research Workshop, 3–14.

6. Bertram, D., Voida, A., Greenberg, S., and Walker, R. (2010). Communication, collaboration, and bugs: the social nature of issue tracking in small, collocated teams. ACM Conference on Computer Supported Cooperative Work, 291–300.

7. Borchers, G. (2003). The software engineering impacts of cultural factors on multi-cultural software development teams. International Conference on Software Engineering, 540–545.

8. Bruckhaus, T., Madhavji, N.H., Janssen, I., and Henshaw, J. (1996). The impact of tools on software productivity. IEEE Software 13, 5, 29–38.

9. Bryant, A. (2013). In head-hunting, big data may not be such a big deal. The New York Times. http://www.nytimes.com/2013/06/20/business/in-head-hunting-big-data-may-not-be-such-a-big-deal.html, retrieved on March 15th, 2015.

10. Carver, J.C., Nagappan, N., and Page, A. (2008). The impact of educational background on the effectiveness of requirements inspections: an empirical study. IEEE Transactions on Software Engineering 34, 6, 800–812.

11. Cruz, S., da Silva, F.Q.B., and Capretz, L.F. (2015). Forty years of research on personality in software engineering: A mapping study. Computers in Human Behavior 46, 94–113.

12. Ericsson, K.A., Krampe, R.T., and Tesch-Römer, C. (1993). The role of deliberate practice in the acquisition of expert performance. Psychological Review 100, 3, 363–406.

13. Gobeli, D.H., Koenig, H.F., and Bechinger, I. (1998). Managing conflict in software development teams: A multilevel analysis. Journal of Product Innovation Management 15, 423–435.

14. Gugerty, L. and Olson, G.M. (1986). Debugging by skilled and novice programmers. SIGCHI Conference on Human Factors in Computing Systems, 171–174.

15. Guo, P.J., Zimmermann, T., Nagappan, N., and Murphy, B. (2011). “Not My Bug!” and other reasons for software bug report reassignments. ACM Conference on Computer Supported Cooperative Work, 395–404.

16. Hewner, M. and Guzdial, M. (2010). What game developers look for in a new graduate: Interviews and surveys at one game company. ACM Technical Symposium on Computer Science Education, 275–279.

17. Hollander, M., Wolfe, D.A., and Chicken, E. (2013). Nonparametric statistical methods. Wiley.

18. Kelley, R.E. (1999). How to be a star engineer. IEEE Spectrum 36, 10, 51–58.

19. Latoza, T.D., Venolia, G., and DeLine, R. (2006). Maintaining mental models: a study of developer work habits. International Conference on Software Engineering, 492–501.

20. Lethbridge, T.C., LeBlanc, R.J., J., Kelley-Sobel, A.E., Hilburn, T.B., and Diaz-Herrera, J.L. (2006). SE2004: Recommendations for undergraduate software engineering curricula. IEEE Software 23, 6, 19–25.

21. Li, P.L., Ko, A.J., and Zhu, J. (2015). What makes a great software engineer? International Conference on Software Engineering, to appear.

22. Margolis, J. and Fisher, A. (2003). Unlocking the clubhouse: Women in computing. The MIT Press.

23. McConnell, S. (2004). Code complete: A practical handbook of software construction. Microsoft Press.

24. Parnin, C. (2014). Supporting interrupted programming tasks with memory-based aids.

25. Patton, M.Q. (2002). Qualitative research and evaluation methods. Sage Publications Inc.

26. Pendharkar, P.C. and Rodger, J.A. (2009). The relationship between software development team size and software development cost. Communications of the ACM 52, 1, 141–144.

27. Perry, D.E., Staudenmeyer, N. A., and Votta, L.G. (1994). People, organizations, and process improvement. IEEE Software 11, July, 36–45.

28. Sackman, H., Erikson, W.J., and Grant, E.E. (1968). Exploratory experimental studies comparing online and offline programming performance. Communications of the ACM 11, 1, 3–11.