ENH: New link line item ordering algorithm

This change introduces a new algorithm for link line construction.  The
order it computes always begins with the exact link line specified by
the user.  Dependencies of items specified by the user are tracked, and
those that are not already satisified by the line are appended to it at
the end with minimal repeats.  This restores the behavior of CMake 2.4
and below while still fixing some of its bugs.  See issue #7546.
This commit is contained in:
Brad King 2008-08-27 10:21:57 -04:00
parent 012e4c4f68
commit 816ee0f83c
2 changed files with 249 additions and 161 deletions

View File

@ -33,10 +33,10 @@ This file computes an ordered list of link items to use when linking a
single target in one configuration. Each link item is identified by
the string naming it. A graph of dependencies is created in which
each node corresponds to one item and directed eges lead from nodes to
those which must *precede* them on the link line. For example, the
those which must *follow* them on the link line. For example, the
graph
C -> B -> A
A -> B -> C
will lead to the link line order
@ -65,11 +65,11 @@ lists:
The explicitly known dependencies form graph edges
X <- Y , X <- A , X <- B , Y <- A , Y <- B
X -> Y , X -> A , X -> B , Y -> A , Y -> B
We can also infer the edge
A <- B
A -> B
because *every* time A appears B is seen on its right. We do not know
whether A really needs symbols from B to link, but it *might* so we
@ -93,12 +93,12 @@ considering these dependency lists:
The explicit edges are
X <- Y , X <- A , X <- B , X <- C , Y <- A , Y <- B , Y <- C
X -> Y , X -> A , X -> B , X -> C , Y -> A , Y -> B , Y -> C
For the unknown items, we infer dependencies by looking at the
"follow" sets:
A: intersect( {B,Y,C} , {C,B} ) = {B,C} ; infer edges A <- B , A <- C
A: intersect( {B,Y,C} , {C,B} ) = {B,C} ; infer edges A -> B , A -> C
B: intersect( {Y,C} , {} ) = {} ; infer no edges
C: intersect( {} , {B} ) = {} ; infer no edges
@ -107,50 +107,10 @@ libraries should not depend on them.
------------------------------------------------------------------------------
Once the complete graph is formed from all known and inferred
dependencies we must use it to produce a valid link line. If the
dependency graph were known to be acyclic a simple depth-first-search
would produce a correct link line. Unfortunately we cannot make this
assumption so the following technique is used.
The original graph is converted to a directed acyclic graph in which
each node corresponds to a strongly connected component of the
original graph. For example, the dependency graph
X <- A <- B <- C <- A <- Y
contains strongly connected components {X}, {A,B,C}, and {Y}. The
implied directed acyclic graph (DAG) is
{X} <- {A,B,C} <- {Y}
The final list of link items is constructed by a series of
depth-first-searches through this DAG of components. When visiting a
component all outgoing edges are followed first because the neighbors
must precede it. Once neighbors across all edges have been emitted it
is safe to emit the current component.
Trivial components (those with one item) are handled simply by
emitting the item. Non-trivial components (those with more than one
item) are assumed to consist only of static libraries that may be
safely repeated on the link line. We emit members of the component
multiple times (see code below for details). The final link line for
the example graph might be
X A B C A B C Y
------------------------------------------------------------------------------
The initial exploration of dependencies using a BFS associates an
integer index with each link item. When the graph is built outgoing
edges are sorted by this index.
This preserves the original link order as much as possible subject to
the dependencies. We then further preserve the original link line by
appending items to make sure all those that might be static libraries
appear in the order and multiplicity that they do in the original
line.
After the initial exploration of the link interface tree, any
transitive (dependent) shared libraries that were encountered and not
included in the interface are processed in their own BFS. This BFS
@ -159,6 +119,60 @@ They are added to the link items with a mark indicating that the are
transitive dependencies. Then cmComputeLinkInformation deals with
them on a per-platform basis.
The complete graph formed from all known and inferred dependencies may
not be acyclic, so an acyclic version must be created.
The original graph is converted to a directed acyclic graph in which
each node corresponds to a strongly connected component of the
original graph. For example, the dependency graph
X -> A -> B -> C -> A -> Y
contains strongly connected components {X}, {A,B,C}, and {Y}. The
implied directed acyclic graph (DAG) is
{X} -> {A,B,C} -> {Y}
We then compute a topological order for the DAG nodes to serve as a
reference for satisfying dependencies efficiently. We perform the DFS
in reverse order and assign topological order indices counting down so
that the result is as close to the original BFS order as possible
without violating dependencies.
------------------------------------------------------------------------------
The final link entry order is constructed as follows. We first walk
through and emit the *original* link line as specified by the user.
As each item is emitted, a set of pending nodes in the component DAG
is maintained. When a pending component has been completely seen, it
is removed from the pending set and its dependencies (following edges
of the DAG) are added. A trivial component (those with one item) is
complete as soon as its item is seen. A non-trivial component (one
with more than one item; assumed to be static libraries) is complete
when *all* its entries have been seen *twice* (all entries seen once,
then all entries seen again, not just each entry twice). A pending
component tracks which items have been seen and a count of how many
times the component needs to be seen (once for trivial components,
twice for non-trivial). If at any time another component finishes and
re-adds an already pending component, the pending component is reset
so that it needs to be seen in its entirety again. This ensures that
all dependencies of a component are satisified no matter where it
appears.
After the original link line has been completed, we append to it the
remaining pending components and their dependencies. This is done by
repeatedly emitting the first item from the first pending component
and following the same update rules as when traversing the original
link line. Since the pending components are kept in topological order
they are emitted with minimal repeats (we do not want to emit a
component just to have it added again when another component is
completed later). This process continues until no pending components
remain. We know it will terminate because the component graph is
guaranteed to be acyclic.
The final list of items produced by this procedure consists of the
original user link line followed by minimal additional items needed to
satisfy dependencies.
*/
//----------------------------------------------------------------------------
@ -180,6 +194,9 @@ cmComputeLinkDepends
// Assume no compatibility until set.
this->OldLinkDirMode = false;
// No computation has been done.
this->CCG = 0;
}
//----------------------------------------------------------------------------
@ -191,6 +208,7 @@ cmComputeLinkDepends::~cmComputeLinkDepends()
{
delete *i;
}
delete this->CCG;
}
//----------------------------------------------------------------------------
@ -244,7 +262,6 @@ cmComputeLinkDepends::Compute()
// Compute the final ordering.
this->OrderLinkEntires();
this->PreserveOriginalEntries();
// Compute the final set of link entries.
for(std::vector<int>::const_iterator li = this->FinalLinkOrder.begin();
@ -402,9 +419,9 @@ void cmComputeLinkDepends::HandleSharedDependency(SharedDepEntry const& dep)
int index = lei->second;
LinkEntry& entry = this->EntryList[index];
// This shared library dependency must be preceded by the item that
// listed it.
this->EntryConstraintGraph[index].push_back(dep.DependerIndex);
// This shared library dependency must follow the item that listed
// it.
this->EntryConstraintGraph[dep.DependerIndex].push_back(index);
// Target items may have their own dependencies.
if(entry.Target)
@ -555,12 +572,12 @@ cmComputeLinkDepends::AddLinkEntries(int depender_index,
// Add a link entry for this item.
int dependee_index = this->AddLinkEntry(item);
// The depender must come before the dependee.
// The dependee must come after the depender.
if(depender_index >= 0)
{
if(!this->EntryList[dependee_index].IsFlag)
{
this->EntryConstraintGraph[dependee_index].push_back(depender_index);
this->EntryConstraintGraph[depender_index].push_back(dependee_index);
}
}
else
@ -713,7 +730,7 @@ void cmComputeLinkDepends::InferDependencies()
for(DependSet::const_iterator j = common.begin(); j != common.end(); ++j)
{
int dependee_index = *j;
this->EntryConstraintGraph[dependee_index].push_back(depender_index);
this->EntryConstraintGraph[depender_index].push_back(dependee_index);
}
}
}
@ -745,7 +762,7 @@ void cmComputeLinkDepends::DisplayConstraintGraph()
e << "item " << i << " is [" << this->EntryList[i].Item << "]\n";
for(NodeList::const_iterator j = nl.begin(); j != nl.end(); ++j)
{
e << " item " << *j << " must precede it\n";
e << " item " << *j << " must follow it\n";
}
}
fprintf(stderr, "%s\n", e.str().c_str());
@ -759,30 +776,55 @@ void cmComputeLinkDepends::OrderLinkEntires()
// the same order in which the items were originally discovered in
// the BFS. This should preserve the original order when no
// constraints disallow it.
cmComputeComponentGraph ccg(this->EntryConstraintGraph);
Graph const& cgraph = ccg.GetComponentGraph();
if(this->DebugMode)
{
this->DisplayComponents(ccg);
}
// Setup visit tracking.
this->ComponentVisited.resize(cgraph.size(), 0);
this->CCG = new cmComputeComponentGraph(this->EntryConstraintGraph);
// The component graph is guaranteed to be acyclic. Start a DFS
// from every entry.
for(unsigned int c=0; c < cgraph.size(); ++c)
// from every entry to compute a topological order for the
// components.
Graph const& cgraph = this->CCG->GetComponentGraph();
int n = static_cast<int>(cgraph.size());
this->ComponentVisited.resize(cgraph.size(), 0);
this->ComponentOrder.resize(cgraph.size(), n);
this->ComponentOrderId = n;
// Run in reverse order so the topological order will preserve the
// original order where there are no constraints.
for(int c = n-1; c >= 0; --c)
{
this->VisitComponent(ccg, c);
this->VisitComponent(c);
}
// Display the component graph.
if(this->DebugMode)
{
this->DisplayComponents();
}
// Start with the original link line.
for(std::vector<int>::const_iterator i = this->OriginalEntries.begin();
i != this->OriginalEntries.end(); ++i)
{
this->VisitEntry(*i);
}
// Now explore anything left pending. Since the component graph is
// guaranteed to be acyclic we know this will terminate.
while(!this->PendingComponents.empty())
{
// Visit one entry from the first pending component. The visit
// logic will update the pending components accordingly. Since
// the pending components are kept in topological order this will
// not repeat one.
int e = *this->PendingComponents.begin()->second.Entries.begin();
this->VisitEntry(e);
}
}
//----------------------------------------------------------------------------
void
cmComputeLinkDepends::DisplayComponents(cmComputeComponentGraph const& ccg)
cmComputeLinkDepends::DisplayComponents()
{
fprintf(stderr, "The strongly connected components are:\n");
std::vector<NodeList> const& components = ccg.GetComponents();
std::vector<NodeList> const& components = this->CCG->GetComponents();
for(unsigned int c=0; c < components.size(); ++c)
{
fprintf(stderr, "Component (%u):\n", c);
@ -793,14 +835,19 @@ cmComputeLinkDepends::DisplayComponents(cmComputeComponentGraph const& ccg)
fprintf(stderr, " item %d [%s]\n", i,
this->EntryList[i].Item.c_str());
}
NodeList const& ol = this->CCG->GetComponentGraphEdges(c);
for(NodeList::const_iterator oi = ol.begin(); oi != ol.end(); ++oi)
{
fprintf(stderr, " followed by Component (%d)\n", *oi);
}
fprintf(stderr, " topo order index %d\n",
this->ComponentOrder[c]);
}
fprintf(stderr, "\n");
}
//----------------------------------------------------------------------------
void
cmComputeLinkDepends::VisitComponent(cmComputeComponentGraph const& ccg,
unsigned int c)
void cmComputeLinkDepends::VisitComponent(unsigned int c)
{
// Check if the node has already been visited.
if(this->ComponentVisited[c])
@ -812,49 +859,126 @@ cmComputeLinkDepends::VisitComponent(cmComputeComponentGraph const& ccg,
this->ComponentVisited[c] = 1;
// Visit the neighbors of the component first.
NodeList const& nl = ccg.GetComponentGraphEdges(c);
for(NodeList::const_iterator ni = nl.begin(); ni != nl.end(); ++ni)
// Run in reverse order so the topological order will preserve the
// original order where there are no constraints.
NodeList const& nl = this->CCG->GetComponentGraphEdges(c);
for(NodeList::const_reverse_iterator ni = nl.rbegin();
ni != nl.rend(); ++ni)
{
this->VisitComponent(ccg, *ni);
this->VisitComponent(*ni);
}
// Now that all items required to come before this one have been
// emmitted, emit this component's items.
this->EmitComponent(ccg.GetComponent(c));
// Assign an ordering id to this component.
this->ComponentOrder[c] = --this->ComponentOrderId;
}
//----------------------------------------------------------------------------
void cmComputeLinkDepends::EmitComponent(NodeList const& nl)
void cmComputeLinkDepends::VisitEntry(int index)
{
assert(!nl.empty());
// Include this entry on the link line.
this->FinalLinkOrder.push_back(index);
// This entry has now been seen. Update its component.
bool completed = false;
int component = this->CCG->GetComponentMap()[index];
std::map<int, PendingComponent>::iterator mi =
this->PendingComponents.find(this->ComponentOrder[component]);
if(mi != this->PendingComponents.end())
{
// The entry is in an already pending component.
PendingComponent& pc = mi->second;
// Remove the entry from those pending in its component.
pc.Entries.erase(index);
if(pc.Entries.empty())
{
// The complete component has been seen since it was last needed.
--pc.Count;
if(pc.Count == 0)
{
// The component has been completed.
this->PendingComponents.erase(mi);
completed = true;
}
else
{
// The whole component needs to be seen again.
NodeList const& nl = this->CCG->GetComponent(component);
assert(nl.size() > 1);
pc.Entries.insert(nl.begin(), nl.end());
}
}
}
else
{
// The entry is not in an already pending component.
NodeList const& nl = this->CCG->GetComponent(component);
if(nl.size() > 1)
{
// This is a non-trivial component. It is now pending.
PendingComponent& pc = this->MakePendingComponent(component);
// The starting entry has already been seen.
pc.Entries.erase(index);
}
else
{
// This is a trivial component, so it is already complete.
completed = true;
}
}
// If the entry completed a component, the component's dependencies
// are now pending.
if(completed)
{
NodeList const& ol = this->CCG->GetComponentGraphEdges(component);
for(NodeList::const_iterator oi = ol.begin(); oi != ol.end(); ++oi)
{
// This entire component is now pending no matter whether it has
// been partially seen already.
this->MakePendingComponent(*oi);
}
}
}
//----------------------------------------------------------------------------
cmComputeLinkDepends::PendingComponent&
cmComputeLinkDepends::MakePendingComponent(unsigned int component)
{
// Create an entry (in topological order) for the component.
PendingComponent& pc =
this->PendingComponents[this->ComponentOrder[component]];
pc.Id = component;
NodeList const& nl = this->CCG->GetComponent(component);
// Handle trivial components.
if(nl.size() == 1)
{
this->FinalLinkOrder.push_back(nl[0]);
return;
// Trivial components need be seen only once.
pc.Count = 1;
}
else
{
// This is a non-trivial strongly connected component of the
// original graph. It consists of two or more libraries
// (archives) that mutually require objects from one another. In
// the worst case we may have to repeat the list of libraries as
// many times as there are object files in the biggest archive.
// For now we just list them twice.
//
// The list of items in the component has been sorted by the order
// of discovery in the original BFS of dependencies. This has the
// advantage that the item directly linked by a target requiring
// this component will come first which minimizes the number of
// repeats needed.
pc.Count = 2;
}
// This is a non-trivial strongly connected component of the
// original graph. It consists of two or more libraries (archives)
// that mutually require objects from one another. In the worst
// case we may have to repeat the list of libraries as many times as
// there are object files in the biggest archive. For now we just
// list them twice.
//
// The list of items in the component has been sorted by the order
// of discovery in the original BFS of dependencies. This has the
// advantage that the item directly linked by a target requiring
// this component will come first which minimizes the number of
// repeats needed.
for(NodeList::const_iterator ni = nl.begin(); ni != nl.end(); ++ni)
{
this->FinalLinkOrder.push_back(*ni);
}
for(NodeList::const_iterator ni = nl.begin(); ni != nl.end(); ++ni)
{
this->FinalLinkOrder.push_back(*ni);
}
// Store the entries to be seen.
pc.Entries.insert(nl.begin(), nl.end());
return pc;
}
//----------------------------------------------------------------------------
@ -896,57 +1020,3 @@ void cmComputeLinkDepends::CheckWrongConfigItem(std::string const& item)
}
}
}
//----------------------------------------------------------------------------
static bool cmComputeLinkDependsNotStatic(cmTarget* tgt)
{
return (tgt &&
tgt->GetType() != cmTarget::STATIC_LIBRARY &&
tgt->GetType() != cmTarget::UNKNOWN_LIBRARY);
}
//----------------------------------------------------------------------------
void cmComputeLinkDepends::PreserveOriginalEntries()
{
// Skip the part of the input sequence that already appears in the
// output.
std::vector<int>::const_iterator in = this->OriginalEntries.begin();
std::vector<int>::const_iterator out = this->FinalLinkOrder.begin();
while(in != this->OriginalEntries.end() &&
out != this->FinalLinkOrder.end())
{
cmTarget* tgt = this->EntryList[*in].Target;
if(cmComputeLinkDependsNotStatic(tgt))
{
// Skip input items known to not be static libraries.
++in;
}
else if(*in == *out)
{
// The input and output items match. Move on to the next items.
++in;
++out;
}
else
{
// The output item does not match the next input item. Skip it.
++out;
}
}
// Append the part of the input sequence that does not already
// appear in the output.
while(in != this->OriginalEntries.end())
{
cmTarget* tgt = this->EntryList[*in].Target;
if(cmComputeLinkDependsNotStatic(tgt))
{
// Skip input items known to not be static libraries.
++in;
}
else
{
this->FinalLinkOrder.push_back(*in++);
}
}
}

View File

@ -131,15 +131,33 @@ private:
// Ordering algorithm.
void OrderLinkEntires();
std::vector<char> ComponentVisited;
std::vector<int> ComponentOrder;
int ComponentOrderId;
struct PendingComponent
{
// The real component id. Needed because the map is indexed by
// component topological index.
int Id;
// The number of times the component needs to be seen. This is
// always 1 for trivial components and is initially 2 for
// non-trivial components.
int Count;
// The entries yet to be seen to complete the component.
DependSet Entries;
};
std::map<int, PendingComponent> PendingComponents;
cmComputeComponentGraph* CCG;
std::vector<int> FinalLinkOrder;
void DisplayComponents(cmComputeComponentGraph const& ccg);
void VisitComponent(cmComputeComponentGraph const& ccg, unsigned int i);
void EmitComponent(NodeList const& nl);
void DisplayComponents();
void VisitComponent(unsigned int c);
void VisitEntry(int index);
PendingComponent& MakePendingComponent(unsigned int component);
void DisplayFinalEntries();
// Preservation of original link line.
// Record of the original link line.
std::vector<int> OriginalEntries;
void PreserveOriginalEntries();
// Compatibility help.
bool OldLinkDirMode;