Good and Bad Benchmarks

On the Governance Functions of Proper Benchmarking

A portfolio’s benchmark is a tool that helps a client (asset owner) perform the essential governance function of monitoring the management of its portfolio. While a good benchmark will make this governance function easier, a bad benchmark will make it more difficult.

This paper explains what a “good benchmark” is. Like portfolios, benchmarks are highly personal, and one size rarely fits all. When every portfolio implementation and governance structure is different, a “good benchmark” for one client may be a “bad” one for another. Understanding what makes a “good benchmark” is essential for asset owners – especially those who delegate the implementation of their investment policy.

There are many different ways of constructing benchmarks, and they all purport to answer slightly different questions. We suggest that when a client has delegated implementation of its investment policy to another party like an advisor, the most important question a benchmark can answer is how well the client’s investment policy is being implemented. In that scenario, only one benchmark design accomplishes the task.

Defining Roles

Every client and portfolio has two essential elements or roles: an owner of the investment policy, and an implementer of the investment policy. The party that has responsibility for these two roles can be different, depending on the investment oversight and governance structure chosen by the asset owner:

  1. The Non-Discretionary Model. An asset owner (client) can retain responsibility for both investment policy and implementation of the policy. This is the case when the client manages its own portfolio – implements its own policy – or hires an advisor to help it do so in a non-discretionary capacity. As an advisor, we call this the “Non-Discretionary Model,” but its logic applies equally to clients who do not employ an advisor at all. Responsibility for both investment policy and implementation of the policy resides with the client.
  2. The Discretionary Model. The client can retain responsibility for policy, and delegate responsibility for the implementation of policy. This is the case when the client hires a discretionary investment advisor or Outsourced Chief Investment Officer (OCIO) to manage the portfolio. In this model, it is the client’s responsibility to articulate investment policy, and the advisor’s responsibility to implement that policy with a portfolio.
  3. The Hybrid Model. The client can retain responsibility for policy, and partially delegate responsibility for implementation of policy. This is the case for a client who delegates, for example, responsibility for trading and rebalancing but not manager selection to its advisor.
  4. The Pooled Model. The client can outsource both policy and implementation of the policy. This is the case when the client joins a “pooled” investment structure alongside other similar investors, like a community foundation. This is the only scenario where the client outsources its investment policy to a third party; the only element of policy that the client retains is the selection of the pooled investment.

Every institutional portfolio we have seen follows one of these governance and oversight structures. The choice of an optimal benchmark depends on what question the benchmark purports to answer – and the answer may be different based on each of these models.

What Different Types of Benchmarks Exist?

A portfolio’s benchmark consists of two essential elements: the indexes represented, and the weights applied to those indexes. We have seen several styles of benchmarks that vary based on these two variables.

So-called “Actual Allocation” benchmarks typically combine the benchmarks of each of the underlying investment managers in the portfolio, at weightings that float based on each manager’s actual moving proportion of the portfolio. This type of benchmark can be calculated entirely without reference to investment policy or the Investment Policy Statement (IPS) – the only inputs to its calculation are from the actual portfolio itself.

This type of benchmark does a great job of helping a client evaluate the implementer’s manager selection decisions, to the exclusion of all other variables that would determine the portfolio’s return. Especially for clients who have adopted the Non-Discretionary Model, this can be a question worth answering. But because it does not reference the Investment Policy, an Actual Allocation benchmark does a poor job of answering the main question of how well the client’s Policy was implemented, in total. Constructing a total portfolio’s benchmark from the underlying benchmarks of the funds or managers in the portfolio means that the total portfolio’s benchmark will change whenever the underlying funds do. This only makes sense to the extent that manager selection is thought of as “policy,” not “implementation of policy” – as it sometimes can be in the Non-Discretionary Model, where these decisions are not delegated.

“Target Allocation” benchmarks combine indexes representing the asset classes in the client’s Investment Policy Statement, weighted by the fixed weightings expressed in that document’s strategic target asset allocation. This method measures the translation of Investment Policy into the actual portfolio, in its totality. It captures asset allocation differences relative to the strategic target, rebalancing decisions, manager selection, implementation frictions, and everything else. Unlike an Actual Allocation benchmark, this type of benchmark cannot be calculated without the Investment Policy Statement.
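The difference between the two styles can be made concrete with a little arithmetic. The sketch below is illustrative only: the two asset classes, the weights, and the single-period returns are hypothetical numbers invented for the example, the helper name `blended_return` is ours, and we assume for simplicity that each manager’s own benchmark is the same index used for its asset class.

```python
def blended_return(weights, returns):
    """Weighted sum of component returns; weights are expected to sum to 1."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weight * returns[name] for name, weight in weights.items())


# Hypothetical single-period data, invented purely for illustration.
index_returns = {"equity": 0.060, "bonds": 0.010}    # asset-class index returns
manager_returns = {"equity": 0.065, "bonds": 0.010}  # what the hired managers earned
target_weights = {"equity": 0.60, "bonds": 0.40}     # fixed weights from the IPS
actual_weights = {"equity": 0.50, "bonds": 0.50}     # where the portfolio actually sat

portfolio = blended_return(actual_weights, manager_returns)
actual_allocation_bm = blended_return(actual_weights, index_returns)
target_allocation_bm = blended_return(target_weights, index_returns)

# Excess return against each benchmark style.
excess_vs_actual = portfolio - actual_allocation_bm   # manager selection only
excess_vs_target = portfolio - target_allocation_bm   # total implementation

print(f"Portfolio return:            {portfolio:.2%}")
print(f"Actual Allocation benchmark: {actual_allocation_bm:.2%}")
print(f"Target Allocation benchmark: {target_allocation_bm:.2%}")
print(f"Excess vs. Actual:           {excess_vs_actual:+.2%}")
print(f"Excess vs. Target:           {excess_vs_target:+.2%}")
```

With these made-up inputs, the same portfolio beats the Actual Allocation benchmark by 0.25 percentage points (the equity manager outperformed) yet trails the Target Allocation benchmark by 0.25 percentage points (the portfolio was underweight equity relative to policy). Only the second comparison reflects the full implementation of policy.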

What Question is the Benchmark Answering?

As a client outsources more responsibilities to a third party, we suggest both that benchmarking becomes more important, and that the range of acceptable benchmark methodologies narrows considerably. Let us consider the Non-Discretionary and Discretionary models above, as bookends of the discretion spectrum, in terms of what benchmarks are appropriate:

  1. The Non-Discretionary Model. In this governance structure, nothing is delegated, and the client is benchmarking itself. The client may be interested in its manager selection ability, to the exclusion of its asset allocation decisions. It may be interested in evaluating implementation except for the “frictions” (e.g., time out of market) that can arise from implementing illiquid investments. Or it may be interested in how well it has implemented its own investment policy, in totality. A variety of appropriate benchmarking approaches exist here, depending on what questions clients seek to answer with them. Both benchmarks that reference the Investment Policy and those that do not may be appropriate in this context.
  2. The Discretionary Model. In this structure, the client has hired a third party to implement its investment policy. In a Discretionary Model process, there is a “handoff” of responsibilities from the client (owner of investment policy) to the implementer of the portfolio (the advisor) – the benchmark’s job is to continuously examine the efficacy of that handoff. We suggest that how well the investment policy has been implemented, in its totality, is the most essential and relevant question that a benchmark can answer for the client. Only benchmarks that reference the Investment Policy are appropriate in this context.

Because clients have a greater need to oversee (benchmark) third-party implementers of their investment policies than themselves, the rest of this paper will explore optimal benchmarking for the Discretionary Model – that is, when clients have retained responsibility for investment policy but delegated the implementation of that policy to a third-party advisor or OCIO.

What a Benchmark Measures, and Doesn’t Measure, in the Discretionary (OCIO) Model

When clients entrust Sellwood with the responsibility of managing their portfolios, it is generally under the “Discretionary Model” outlined above, where the client retains responsibility for investment policy but delegates implementation of that policy to Sellwood. We implement portfolios for clients using the following framework:

  1. We first assess the client’s unique objectives and constraints. We want to understand what the client is trying to accomplish with their investment portfolio, and what specific relevant risks they face in doing so. We document these objectives and constraints in the client’s uniquely personalized Investment Policy Statement.
  2. We work with the client to design a customized strategic policy portfolio that best addresses the client’s unique needs, objectives, risk tolerance, and constraints. We help the client document this strategic policy (target) portfolio in the Investment Policy Statement. While we help the client with this document, proper governance dictates that the client always must be in full control of it. The advisor should never have permission to modify the client’s Investment Policy Statement. It documents essential direction from the client to the advisor.
  3. Then we implement a portfolio for the client, consistent with their Investment Policy Statement.

The process flows from the client’s real-world circumstances, to Investment Policy, to portfolio. From a broader governance perspective, the process flows from “things the client is responsible for” to “things the advisor is responsible for”:

The client is responsible for investment policy; the advisor is responsible for implementing investment policy. A properly constructed benchmark critically examines the link between client responsibility and advisor responsibility – it evaluates, and measures the ongoing efficacy of, the handoff of responsibility from client to advisor, or the translation of “investment policy” to “implementation of investment policy.”

In simpler terms, when a client has delegated implementation of its investment policy, the purpose of a portfolio’s benchmark is to measure how well the client’s Investment Policy has been implemented.

Knowing the benchmark’s purpose, we can then construct the benchmark to suit that purpose.

A “Good Benchmark” for a Discretionary Advisor

While there are many ways to construct a total portfolio’s benchmark, only one method truly measures how well investment policy has been implemented – a total portfolio benchmark that implements the strategic target portfolio articulated in the Investment Policy Statement, in the most straightforward and lowest-cost way possible. This benchmark will be calculated using fixed, strategic target portfolio weights, and reasonable, investable index proxies for each asset category in the strategic target. There should be an intimate, unbroken link between the client’s investment policy and the advisor’s benchmark. The benchmark should flow directly from the policy. It should not be possible to calculate this benchmark without reference to the policy.

A sample benchmark that fits this framework would be as follows:

Strategic Policy Target in Client’s Investment Policy Statement    Benchmark Calculation
---------------------------------------------------------------    -------------------------------------
40% US Equity                                                      40% Russell 3000 Index
20% International Equity                                           20% MSCI ACWI ex US Index
30% Investment-Grade Fixed Income                                  30% Bloomberg US Aggregate Bond Index
10% Public Real Estate                                             10% NAREIT Equity Index

Note that the benchmark’s calculation percentages match the Investment Policy Statement’s strategic target weights, and the benchmarks are good representatives of each broad asset class being strategically allocated to. The intimate link between the client’s IPS and the portfolio’s benchmark is preserved with this calculation methodology. The benchmark represents the policy and therefore the client’s objectives.
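As a rough sketch of how the sample benchmark above could be calculated, the Python below applies the fixed policy weights to each period’s index returns and compounds the results. The weights come from the sample; the quarterly index returns are hypothetical numbers invented for illustration, and the function names and the choice to reset to target weights every quarter are our assumptions rather than a prescribed methodology.

```python
# Fixed policy weights, taken from the sample benchmark above.
POLICY_WEIGHTS = {
    "Russell 3000 Index": 0.40,
    "MSCI ACWI ex US Index": 0.20,
    "Bloomberg US Aggregate Bond Index": 0.30,
    "NAREIT Equity Index": 0.10,
}

# Hypothetical quarterly index returns, invented for illustration only.
QUARTERLY_INDEX_RETURNS = [
    {
        "Russell 3000 Index": 0.05,
        "MSCI ACWI ex US Index": 0.03,
        "Bloomberg US Aggregate Bond Index": 0.01,
        "NAREIT Equity Index": 0.02,
    },
    {
        "Russell 3000 Index": -0.02,
        "MSCI ACWI ex US Index": 0.01,
        "Bloomberg US Aggregate Bond Index": 0.02,
        "NAREIT Equity Index": -0.01,
    },
]


def period_benchmark_return(weights, index_returns):
    """One period's benchmark return: fixed weights times index returns."""
    return sum(weight * index_returns[name] for name, weight in weights.items())


def cumulative_return(period_returns):
    """Compound a sequence of period returns into one cumulative return."""
    growth = 1.0
    for period_return in period_returns:
        growth *= 1.0 + period_return
    return growth - 1.0


# Assumption: the benchmark resets to the policy weights every quarter.
benchmark_by_quarter = [
    period_benchmark_return(POLICY_WEIGHTS, quarter)
    for quarter in QUARTERLY_INDEX_RETURNS
]
benchmark_cumulative = cumulative_return(benchmark_by_quarter)

for number, value in enumerate(benchmark_by_quarter, start=1):
    print(f"Quarter {number} benchmark return: {value:.2%}")
print(f"Cumulative benchmark return: {benchmark_cumulative:.2%}")
```

With these invented inputs the benchmark returns 3.10% in the first quarter and -0.10% in the second, for roughly 3.0% cumulatively. Nothing in the calculation depends on the actual portfolio; the only inputs are the policy weights and published index returns.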

We have sometimes seen more granular strategic benchmarks, both in the Strategic Policy Target documented in the IPS and the benchmark that flows from it (for example, 30% large-cap US equity and 10% small-cap US equity, rather than 40% US equity). We suggest that the proper framing for this decision is whether the allocation is truly a “policy” allocation. If the policy decision is to have 40% of the portfolio in US equity, then the benchmark should represent that decision. If the policy decision is to have a 30%/10% mix of large- and small-cap US equities, then both the IPS policy target and the benchmark should reflect that objective. On the other hand, if the large/small-cap portfolio mix is an implementation decision rather than a policy, then it should not be captured in the benchmark. It should be expressed instead as performance differences between the portfolio and the policy benchmark.

If the goal is to assess a discretionary advisor’s implementation of a client’s investment policy, a portfolio’s benchmark should match the simplest-possible implementation of the Investment Policy’s strategic target. It is possible to make a benchmark more complicated than this, but not possible to make it better.

Benchmarking Cannot Be Delegated

It is important that the construction of the benchmark not be delegated to the advisor. While a good advisor will always assist the client in drafting the client’s Investment Policy Statement, it is essential that the client retain control of the document. A well-constructed Investment Policy Statement will give the advisor clear direction on what tools it can use to design the portfolio, and in what proportions. It will also express a clear benchmark, which is an important yardstick for measuring the advisor’s performance in implementing the Policy.

We have seen many Investment Policy Statements that do not state what the portfolio’s benchmark is, but instead offer a loose description of what types of benchmarks may be acceptable. This invites bad benchmarking, not to mention sloppy governance. Asking the advisor (implementer of investment policy) to design their own benchmark is like asking a student to write and grade their own spelling test. This is why we insist that the Investment Policy Statement, which is always controlled by the client, clearly articulate the portfolio’s benchmark. This is not to say that a good advisor shouldn’t assist a client with drafting their IPS; only to say that control over – responsibility for – the document should always rest with the client.

Benchmarks should err on the side of simplicity in their construction. They need to be well understood by both the client and the implementer of their portfolio. A good benchmark is so clearly articulated in the Investment Policy that a reader could calculate the benchmark return by hand (with available index data).

What Does Success Look Like?

With a good benchmark in place, the portfolio’s results have a reference point for comparison. Then a framework for evaluating the portfolio, and the implementer of the portfolio, can be introduced.

To be clear: a target-weighted, strategic policy benchmark constructed as outlined above will represent a very good portfolio — a portfolio that will satisfy the client’s objectives all by itself. It should be a very difficult benchmark to beat, but beating it should not necessarily be the objective.

If the purpose of the portfolio is to meet the client’s objectives, then a portfolio that matches the return of the (properly constructed) benchmark, net of investment fees, is a success. Evaluating a customized portfolio is a different exercise from evaluating an active manager. With an active investment manager, we are seeking a higher return than a benchmark in exchange for a higher-than-benchmark fee. A good advisor or OCIO, by contrast, should be designing a portfolio that prioritizes reliable delivery of the client’s specific objectives. While delivering a higher return than the objectives (expressed in the benchmark) can be rewarding, doing so isn’t typically the point, and the pursuit of higher return creates a perverse incentive to build a portfolio that strays from its intended purpose and takes on unwelcome risks. When we sit down with clients and help them frame their goals for the portfolio, excess return is rarely one of those goals. Delivery of their unique objectives always is.

It is also important to align evaluation horizon with the investment horizon. We typically design long-term portfolios for long-term investors. Judging a portfolio’s return versus a benchmark on anything less than a multi-year horizon, ideally on a rolling basis, is likely to be counterproductive.
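One way to put a multi-year, rolling evaluation into practice is sketched below. The annual returns are hypothetical, the three-year window is an arbitrary choice for illustration (the right horizon depends on the client), and the function names are ours.

```python
def annualized(returns):
    """Geometric average annual return over a sequence of annual returns."""
    growth = 1.0
    for annual_return in returns:
        growth *= 1.0 + annual_return
    return growth ** (1.0 / len(returns)) - 1.0


def rolling_excess(portfolio, benchmark, window):
    """Annualized portfolio-minus-benchmark return for each rolling window."""
    if len(portfolio) != len(benchmark):
        raise ValueError("return series must be the same length")
    if window < 1 or window > len(portfolio):
        raise ValueError("window must be between 1 and the series length")
    results = []
    for start in range(len(portfolio) - window + 1):
        end = start + window
        results.append(
            annualized(portfolio[start:end]) - annualized(benchmark[start:end])
        )
    return results


# Hypothetical annual returns (net of fees), invented for illustration only.
portfolio_annual = [0.08, -0.03, 0.12, 0.05, 0.07]
benchmark_annual = [0.07, -0.01, 0.11, 0.06, 0.07]

# Five years of data yield three overlapping three-year windows.
three_year_excess = rolling_excess(portfolio_annual, benchmark_annual, window=3)
for index, excess in enumerate(three_year_excess, start=1):
    print(f"Rolling 3-year window {index}: {excess:+.2%} per year")
```

Reading the windows side by side smooths out single-year noise, which is the point of matching the evaluation horizon to the investment horizon.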

Best Practices

Every client, portfolio, IPS, and governance structure is a little different – but the principles of benchmark design apply equally to all. We see the following as best practices:

  • The client (asset owner), not the implementer of policy, must control the benchmark. Having the total portfolio’s benchmark calculation methodology very clearly articulated in the IPS accomplishes this. The calculation methodology should be so clear that any third party should be able to calculate the benchmark return using nothing more than the IPS, independent index return information, and a hand calculator.
  • When a client has chosen to hire a third party to implement their Investment Policy (e.g., an advisor with investment discretion), then it is essential to construct the total portfolio benchmark using the indexes and fixed policy weights articulated in the client’s Investment Policy Statement. The purpose of the portfolio benchmark should be to help the client evaluate implementation of their Investment Policy, not just the performance of a portfolio in the abstract. Insisting on this methodology best preserves the essential link between Investment Policy and implementation of that Policy.
  • If other benchmark methods are used, the client should have a clear understanding of specifically what those benchmarks are capturing, and what they are leaving out – and ideally select benchmarks that most appropriately answer their essential questions. For example, an “Actual Allocation” index, based on actual manager benchmarks at their actual, real-time weightings, will measure only the aggregated performance of those managers, and never the effects of asset allocation differences relative to a target, gaps left in structure due to manager selection, the overall design of manager structures within each asset class, the benefits or costs of rebalancing, implementation frictions, etc.
  • When in doubt, a “Target Allocation” benchmark, using fixed weights and indexes from the IPS, is a good choice for any governance model. It is the utility player of portfolio benchmarks.


What is the purpose of a benchmark? Different governance structures imply different questions that a benchmark may answer. What party is being evaluated? For what tasks and responsibilities? Are there principal/agent problems that a benchmark can address – or a lack thereof that a benchmark should not?

When an asset owner has delegated responsibility for managing a portfolio (implementing policy), we believe that a portfolio benchmark’s primary purpose is to assess how well the advisor is implementing the Investment Policy. There should be an intimate, unbreakable relationship between the client’s Investment Policy and the portfolio’s benchmark. The benchmark should always come directly from the Investment Policy Statement, and it should not be possible to calculate a benchmark without reference to the Investment Policy Statement. Exceptions to this rule should be very rare and thoughtfully considered.

Contemplating benchmark construction can often feel technical, and asset owners may be tempted to outsource the decision to the “experts.” But it really is essential for the asset owner to retain full control and understanding of their portfolio’s benchmark, lest the portfolio’s implementer be allowed to grade their own spelling test, so to speak. If the primary purpose of a benchmark is to provide a governance mechanism, then determining the benchmark must remain the client’s responsibility. (A good advisor will always still assist the client in discharging this responsibility.)

Benchmarking a total portfolio is an essential governance function for any asset owner, and the most appropriate benchmark for any portfolio reflects the governance relationship between the client, who is responsible for Investment Policy, and the implementer of that Investment Policy (most typically, an outside hired advisor). Proper benchmark design can solve principal/agent problems between these two parties, but improper benchmark design can introduce them.

*   *   *   *   *

Appendix: What About Illiquid Investments?

Some portfolios contain illiquid investments, and it can be tempting to design a total portfolio benchmark differently to account for this illiquidity. Whether the benchmark should change (away from the optimal framework articulated above) depends on who controls the timing of the cash flows into and out of these investments.

Open-ended (sometimes called “core”) private real estate funds are a good example of an investment where we would not advocate changing the benchmark to account for illiquidity, as tempting as it can be to do so. Private real estate funds typically offer quarterly liquidity, and sometimes investors must enter a queue to either enter or exit the fund. These practical constraints make it difficult to maintain consistent targeted exposure to the asset class, especially when transitioning between funds; it is not unusual to sit “out of the market” for a quarter or two, if the timing of one fund’s redemption doesn’t perfectly align with another fund’s entry.

We have seen some investors prefer to calculate an “actual allocation” benchmark for their portfolio to account for this logistical difficulty. Under this method of creating a total portfolio benchmark, the underlying indexes are weighted using their actual weights at any point in time, rather than using the strategic target weights from the Investment Policy. The effect of this calculation is to leave the benchmark unallocated to private real estate whenever the portfolio itself is unallocated.

This benchmarking approach does not measure the efficacy of the portfolio’s implementation of Investment Policy. The Policy calls for consistent exposure to real estate, or at least for returns that are comparable to consistent exposure to real estate — high enough to overcome the logistical drag inherent in choosing to employ private real estate funds. Imposing that logistical drag on the benchmark as well as the portfolio inappropriately breaks the link between Policy and implementation. In our example of private real estate funds, the portfolio implementer chose to implement the client’s investment policy in a logistically challenging way. The drag arising from that logistical challenge should be expressed in return deviation from benchmark just as much as the higher return associated with selecting a superior fund would.
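A small numerical sketch of this point follows. It models one quarter in which the real estate allocation sits in cash while waiting in a fund’s entry queue. All figures are hypothetical, the three-sleeve breakdown is a simplification of ours, and we assume every holding earns exactly its index return so that the out-of-market drag is the only effect being measured. We also assume the actual-allocation benchmark treats the uninvested money as cash.

```python
def blended_return(weights, returns):
    """Weighted sum of component returns."""
    return sum(weight * returns[name] for name, weight in weights.items())


# Hypothetical one-quarter index returns, invented for illustration only.
index_returns = {"rest_of_portfolio": 0.020, "real_estate": 0.015, "cash": 0.005}

# Policy calls for a constant 10% in real estate.
policy_weights = {"rest_of_portfolio": 0.90, "real_estate": 0.10, "cash": 0.00}

# During the transition between funds, the real estate sleeve sits in cash.
transition_weights = {"rest_of_portfolio": 0.90, "real_estate": 0.00, "cash": 0.10}

portfolio_return = blended_return(transition_weights, index_returns)
actual_allocation_bm = blended_return(transition_weights, index_returns)
target_allocation_bm = blended_return(policy_weights, index_returns)

drag_vs_actual = portfolio_return - actual_allocation_bm
drag_vs_target = portfolio_return - target_allocation_bm

print(f"Portfolio return:            {portfolio_return:.2%}")
print(f"Shortfall vs. Actual:        {drag_vs_actual:+.2%}")
print(f"Shortfall vs. Target:        {drag_vs_target:+.2%}")
```

Under these assumptions the actual-allocation benchmark reports no shortfall at all, because it sat in cash alongside the portfolio. The target-weighted benchmark shows a shortfall of 0.10 percentage points for the quarter, which is the cost of the implementation choice.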

At the same time, there are some illiquid investments, like private equity, where the logistics are entirely out of the client’s or the portfolio manager’s hands, because the very nature of the investment involves cash flows whose timing is directed by the fund manager, not by the client or their advisor. For portfolios with private markets investments like these, we work with clients to design their Investment Policies to acknowledge this limitation and identify the placeholder location for the assets ultimately designated for private markets, elsewhere in the client’s portfolio. Knowing this information, we can design a total portfolio benchmark that still expresses the client’s Investment Policy with high fidelity. The link between Policy and benchmarking remains unbroken.