Efficient frontier analysis

Revision as of 16:29, 4 April 2004 by N8chz (talk | contribs) (attempt at a fairly nontechnical intro to subject)

The efficient frontier is a convex hull of a set of data points, or more precisely the portion of that hull facing the preferred direction. The concept has been in circulation for a long time and has been used in a wide variety of disciplines. For this reason it goes by many names, with several nomenclatures for describing its various intricacies. Synonyms and closely related terms are numerous, and include:

  • efficiency frontier
  • efficient set
  • Pareto optimal set
  • non-dominated set
  • convex hull
  • multiple objective optimization (MOO)

Efficient frontier analysis (EFA) is used to negotiate "optimal" compromises between "competing" objectives. The classic definition of EFA is found in finance textbooks and is based on the "tradeoff" between risk (called "beta," a measure of the covariance between a given issue and "the rest of the market") and reward (called yield) expected from the various available issues of financial paper (stocks, bonds, etc.). In the textbook example, it is assumed that investors wish to minimize beta (i.e. risk) and maximize yield, with beta on the abscissa (horizontal axis) and yield on the ordinate (vertical axis).

The convex hull of the data points consists of those points that can't be enclosed by connecting other points (dots) with line segments. When one wants to maximize the vertical-axis variable while minimizing the horizontal one, the portion of the convex hull of interest is the upper left ("northwest") portion. An intuitive test of whether a particular point is on this upper left frontier is to "move the origin" to the point in question. If the point is "non-dominated" within the larger set, then there will be no points in the second (northwest) quadrant after moving the origin in this way, because any point there would offer both lower risk and higher yield. Think of "non-dominated" as the opposite of "dominated": if one candidate's parameters are "superior in every way" to another's, one might say the former "dominates" the latter.

The EFA method applies to optimization problems in other fields, too. A two-variable EFA might instead define the efficient set as the upper right, lower right or lower left portion of the convex hull, depending on which variables one wishes to minimize or maximize. When negotiating between more than two objectives, the number of possible efficiency frontiers is more than four: with n objectives, each either minimized or maximized, there are 2^n. EFA relies on a number of assumptions:

  • The variables being analyzed are quantitative (totally ordered, so that any two values can be compared).
  • The design criteria unambiguously prefer either larger or smaller values for any particular variable.
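
The "move the origin" test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `senses` convention (+1 to maximize, -1 to minimize) and the sample (beta, yield) figures are all made up for the example. Note also that it computes the non-dominated set directly rather than constructing a convex hull, so it may keep a point that lies slightly inside the hull.

```python
def dominates(a, b, senses):
    """True if a is at least as good as b on every objective
    and strictly better on at least one.
    senses[i] is +1 if objective i is maximized, -1 if minimized."""
    at_least_as_good = all(s * x >= s * y for x, y, s in zip(a, b, senses))
    strictly_better = any(s * x > s * y for x, y, s in zip(a, b, senses))
    return at_least_as_good and strictly_better


def efficient_set(points, senses):
    """Keep every point that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p, senses) for q in points)]


# Made-up (beta, yield) pairs: minimize beta, maximize yield.
issues = [(0.5, 3.0), (1.0, 5.0), (1.2, 4.0), (0.8, 2.0), (1.5, 6.0)]
SENSES = (-1, +1)

print(efficient_set(issues, SENSES))
# [(0.5, 3.0), (1.0, 5.0), (1.5, 6.0)]
```

Changing `SENSES` to, say, `(-1, -1)` selects the lower left frontier instead, and longer tuples handle more than two objectives.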

EFA may be of limited use to Consumerium activities, or even irrelevant to them. There seems to be a decided preference for non-quantitative over quantitative criteria in describing Consumerium norms and priorities. The reasoning behind this seems to include a desire to emphasize an ethical agenda rather than ask shallowly commercial questions such as "is the price of X commensurate with its 'specifications'?"

The present author has included this information on EFA in Consumerium's pages, anyway, because it seems reasonable to believe that some criteria of interest to Consumerium may be approximated by "yes or no" type questions...

  • Does X incorporate animal products?
  • Are documented deviations from International Law associated with X?

For the sake of argument, let's assume that the preferred answer to both these questions is "no." We can associate the word "no" with the number 0 and "yes" with 1. In this case we have the "order property," and consider the lower left frontier efficient. If there are no data points for which the answers are unknown, then there are four possible locations in the graph. If the location (0,0) is inhabited, then its inhabitants comprise the efficient set. If (0,0) is uninhabited and (1,0) and (0,1) are both uninhabited as well, then every point sits at (1,1) and the entire set is efficient (and also inefficient, of course). Otherwise, the efficient set consists of the (1,0) and (0,1) points combined.
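
The three cases just described can be checked with a small Python sketch. The product names and the function `lower_left_efficient` are hypothetical, and the encoding (0 for "no", 1 for "yes", both minimized) is the one assumed in the paragraph above.

```python
def lower_left_efficient(answers):
    """answers maps a product name to a pair (a, b), 0 = "no", 1 = "yes".
    Both coordinates are minimized. Returns the names of the
    non-dominated products, sorted alphabetically."""
    def dominated(p):
        # q dominates p if q is no worse on both questions and differs from p
        return any(q != p and q[0] <= p[0] and q[1] <= p[1]
                   for q in answers.values())
    return sorted(name for name, p in answers.items() if not dominated(p))


# (0,0) uninhabited, (1,0) and (0,1) inhabited: those two cells are efficient.
print(lower_left_efficient({"A": (1, 0), "B": (0, 1), "C": (1, 1)}))
# ['A', 'B']

# (0,0) inhabited: its inhabitants are the whole efficient set.
print(lower_left_efficient({"A": (1, 0), "B": (0, 1), "D": (0, 0)}))
# ['D']

# Everything at (1,1): the entire set is efficient.
print(lower_left_efficient({"E": (1, 1), "F": (1, 1)}))
# ['E', 'F']
```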

In many cases, the answer to a yes/no question (or, for that matter, a question with a numeric answer) can be "unknown." This possibility must be accounted for. One might by default assume that "maybe" belongs between "yes" and "no," but the positioning of "maybe" depends on the situation. For research activities, for example, one might wish to identify and patch "holes" in Consumerium's knowledge base. In an efficiency-based "queueing system" for picking factoids for further investigation, the "maybe" value might then be prioritized above the other two. A description of such a strategy is given in my essay "efficient shopping lists":

http://geocities.com/n8chz/esl.htm
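
As a rough illustration of how the position of "maybe" can depend on the purpose, here is a minimal Python sketch. The two rank tables and the `queue` function are assumptions made up for this example, and are not taken from the essay linked above.

```python
# Hypothetical rank tables: a lower rank is picked first.
SHOPPING_RANK = {"no": 0, "maybe": 1, "yes": 2}  # "maybe" sits between "no" and "yes"
RESEARCH_RANK = {"maybe": 0, "no": 1, "yes": 1}  # unknowns come first; known answers tie


def queue(facts, rank):
    """facts maps an item to its answer ("yes", "no" or "maybe").
    Returns the items ordered by rank; the sort is stable, so ties
    keep the order in which the items were entered."""
    return sorted(facts, key=lambda item: rank[facts[item]])


facts = {"X": "yes", "Y": "maybe", "Z": "no"}
print(queue(facts, SHOPPING_RANK))  # ['Z', 'Y', 'X']
print(queue(facts, RESEARCH_RANK))  # ['Y', 'X', 'Z']
```

The same data gives two different orderings, which is the point: the ranking of "maybe" belongs to the query, not to the data.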

Another use Consumerium might have for EFA concerns information that entities volunteer to Consumerium (or to the public domain at large) about their own products. In such a scheme (certainly in the public domain), which "variables" of interest are made public can vary between suppliers. If Consumerium holds "transparency" as a "good" (as seems to be the case), supplied values should perhaps be preferred over unsupplied ones in some cases. Here "transparency" refers to a general preference for more (public and accurate) information over less.

This is one reason why I think Consumerium should consider not running on wiki alone, but consider including a database facility. A database can offer the ability to index columns with ordered domains, and create a truly endless variety of ad-hoc queries, as well as the ability to save and refine (and most importantly, share) these queries as "views." Creation of these queries can mix and match all manner of...

  • identification of variables one wishes to minimize or maximize
  • whether to rank "null" cells (often signifying unknown data) above or below "non-null" ones
  • what design objectives to prioritize over others
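
To make the list above a little more concrete, here is a minimal sketch using Python's standard `sqlite3` module. The table `product`, its columns, the view `fully_answered` and the helper `ranked` are all hypothetical, invented for this example; nothing here describes an existing Consumerium schema. The ordering is lexicographic, so the first column named takes priority over the second.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE product (
        name TEXT PRIMARY KEY,
        animal_products INTEGER,   -- 0 = no, 1 = yes, NULL = unknown
        law_deviations  INTEGER    -- 0 = no, 1 = yes, NULL = unknown
    )
""")
conn.executemany("INSERT INTO product VALUES (?, ?, ?)", [
    ("A", 0, 0), ("B", 1, None), ("C", 0, 1), ("D", None, 0),
])

# A saved, shareable query: only products with both answers known.
conn.execute("""
    CREATE VIEW fully_answered AS
    SELECT name, animal_products, law_deviations FROM product
    WHERE animal_products IS NOT NULL AND law_deviations IS NOT NULL
""")


def ranked(conn, nulls_first):
    """Order products by both columns (smaller preferred), with
    animal_products taking priority over law_deviations.
    nulls_first decides whether unknown cells rank above known ones."""
    null_key = "IS NOT NULL" if nulls_first else "IS NULL"
    sql = f"""
        SELECT name FROM product
        ORDER BY (animal_products {null_key}), animal_products,
                 (law_deviations {null_key}), law_deviations, name
    """
    return [row[0] for row in conn.execute(sql)]


print(ranked(conn, nulls_first=False))  # ['A', 'C', 'B', 'D']
print(ranked(conn, nulls_first=True))   # ['D', 'A', 'C', 'B']
print([r[0] for r in conn.execute("SELECT name FROM fully_answered ORDER BY name")])
# ['A', 'C']
```

Ranking unknowns last suits a shopper, while ranking them first suits someone looking for gaps to research; both queries read the same base table.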

Queries can be fine-tuned to a variety of purposes and Consumerium activities, while the data in the underlying base tables at any given time represents the current "state" of Consumerium research. Keeping these base tables free of redundancies, contradictions, and general disinformation can become a formidable challenge if distributed database methods are chosen, but a distributed database may allow...

  • a less centralized project structure.
  • possible opportunities to avoid having to deal with main$tream financiers to finance large projects within Consumerium, by simply keeping projects of the "server" type smallish.
  • possible opportunities to leverage existing methods for file sharing and distributed computing à la Seti@Home.