Thread Subject: Non-strict Convex Optimization
Hi MATLABERs,
I am attempting to do convex optimization on a set, C, which is convex, but not strictly so (that is, it is comprised of faces which are sections of hyperplanes, so in places it is linear and hence not strictly convex). The function I am looking to minimize (specifically to solve f(p,q) = zeros(n,1) ), can be most easily represented as a function of the tangent plane ("p") and the value of C ("q"). The two issues I am struggling with are: 1) The hyperplane faces require exact slopes in order to search over them, so numerical methods which approximate the slopes will have a hard time searching. 2) For a given slopes ("p"), there can be lots of consistent points on C ("q") (e.g. on a face). For a given value of q, there can be a number of values of p (e.g. on a vertex). Any advice on methods to look at which can solve this sort of problem? Thank you, Cory |
"Cory" wrote in message <jnucu3$1r0$1@newscl01ah.mathworks.com>...
>The two issues I am struggling with are: > > 1) The hyperplane faces require exact slopes in order to search over them, so numerical methods which approximate the slopes will have a hard time searching. > 2) For a given slopes ("p"), there can be lots of consistent points on C ("q") (e.g. on a face). For a given value of q, there can be a number of values of p (e.g. on a vertex). > > Any advice on methods to look at which can solve this sort of problem? Thank you, ================= Your description of what "p" represents isn't very clear and you haven't described what q represents at all. However, maybe you shouldn't be parametrizing in terms of p and q in the first place. The way you parametrize an optimization problem can affect its sensitivity to numerical problems. If you describe f(p,q) in more detail, maybe alternative and more stable formulations of the problem can be explored. |
"Cory" wrote in message <jnucu3$1r0$1@newscl01ah.mathworks.com>...
> Hi MATLABERs, > > I am attempting to do convex optimization on a set, C, which is convex, but not strictly so (that is, it is comprised of faces which are sections of hyperplanes, so in places it is linear and hence not strictly convex). The function I am looking to minimize (specifically to solve f(p,q) = zeros(n,1) ), can be most easily represented as a function of the tangent plane ("p") and the value of C ("q"). > > The two issues I am struggling with are: > > 1) The hyperplane faces require exact slopes in order to search over them, so numerical methods which approximate the slopes will have a hard time searching. > 2) For a given slopes ("p"), there can be lots of consistent points on C ("q") (e.g. on a face). For a given value of q, there can be a number of values of p (e.g. on a vertex). ============== I would also add that the two issues you've cited above don't sound all that unusual or problematic. So what if your solution lies exactly in one of the hyperplane faces? This is typically the case in linear and quadratic programming. Why should it create any greater complications for you? Iterative minimization algorithms never presume to find the exact solution, anyway. They only aim to converge to it. Similarly, why is it a big deal that your constraints remain satisfied when q is fixed and p is varied? It's the very same situation as when you have bound contraints, e.g. x(1)>=0 x(2)>=0 Along a certain edge of the feasible set, x(1) can be held fixed while x(2) varies. |
> Your description of what "p" represents isn't very clear and you haven't described what q represents at all.
Sorry, I think my description was vague. Let me attempt to be more precise: I have a set C which is the boundary of a (non-strict) convex set. You could imagine the boundary of a convex set, except instead of being curved, it has faces which are sections of hyperplanes (e.g. in 2D it would be a series of increasingly downward sloping lines). Say "q" is a point on this set and "p" is a plane tangent to C at q (e.g. in 2D "p" is the slope of of a line tangent at q; at vertices there are multiple possible p). I have a function f(q,p) and I want to solve for f(q,p) = zeros(n,1). What is the best method for doing so? |
"Cory" wrote in message <jnul0r$a0o$1@newscl01ah.mathworks.com>...
> > Your description of what "p" represents isn't very clear and you haven't described what q represents at all. > > Sorry, I think my description was vague. Let me attempt to be more precise: > > I have a set C which is the boundary of a (non-strict) convex set. You could imagine the boundary of a convex set, except instead of being curved, it has faces which are sections of hyperplanes (e.g. in 2D it would be a series of increasingly downward sloping lines). ================ Is the boundary non-curved everywhere, or is it partially curved? > Say "q" is a point on this set and "p" is a plane tangent to C at q (e.g. in 2D "p" is the slope of of a line tangent at q; at vertices there are multiple possible p). > > I have a function f(q,p) and I want to solve for f(q,p) = zeros(n,1). What is the best method for doing so? ============== You'll need to extend the definition of f(q,p) so that it can be evaluated in the interior of C as well as on the boundary. You must extrapolate it in such a way that no solution to f(q,p) = zeros(n,1) can exist in the interior. Once you do that, you can probably use FMINCON, assuming you have the Optimization Toolbox. |
> Is the boundary non-curved everywhere, or is it partially curved?
Non-curved everywhere. It's the convex hull of a finite set of vertices. > You'll need to extend the definition of f(q,p) so that it can be evaluated in the interior of C as well as on the boundary. You must extrapolate it in such a way that no solution to f(q,p) = zeros(n,1) can exist in the interior. > > Once you do that, you can probably use FMINCON, assuming you have the Optimization Toolbox. Alright, I'll have to think of a way to do that. Thanks! |
"Cory" wrote in message <jnumcb$g8l$1@newscl01ah.mathworks.com>...
> > Is the boundary non-curved everywhere, or is it partially curved? > > Non-curved everywhere. It's the convex hull of a finite set of vertices. > > > You'll need to extend the definition of f(q,p) so that it can be evaluated in the interior of C as well as on the boundary. You must extrapolate it in such a way that no solution to f(q,p) = zeros(n,1) can exist in the interior. > > > > Once you do that, you can probably use FMINCON, assuming you have the Optimization Toolbox. > > Alright, I'll have to think of a way to do that. Thanks! ================== So, you're minimizing over a bounded polyhedron C which can be expressed by matrix inequalities A*q<=b Then you could redefine f(p,q) over all of C as follows fnew(p,q)=f(p,q)+prod(b-A*q) The additional term has to be positive in the interior of C, so roots of fnew can still only be on the boundary. So now, you would use FMINCON to minimize this with q constrained to A*q<=b and p unconstrained. |
"Matt J" wrote in message <jnun8a$k94$1@newscl01ah.mathworks.com>...
> > Then you could redefine f(p,q) over all of C as follows > > fnew(p,q)=f(p,q)+prod(b-A*q) > > The additional term has to be positive in the interior of C, so roots of fnew can still only be on the boundary. So now, you would use FMINCON to minimize this with q constrained to A*q<=b and p unconstrained. =================== Sorry, I mean you would minimize objective(p,q)=norm(f(p,q))^2 + prod(b-A*q)*Weight; over q\in C={q:A*q<=b} and p \in R^n Assuming f(p,q) has a root on the boundary of C, this modified objective function will be minimized at such a root for a sufficiently large Weight>0. |
> Then you could redefine f(p,q) over all of C as follows
> > fnew(p,q)=f(p,q)+prod(b-A*q) Thanks for this suggestion. Unfortunately, C is not easy to write out in that form. It would have a very large number of vertices which, with dimensions > 2, would lead to far too vast an "A" term for it to be computationally feasible. C can be described in terms of an optimization process (e.g. I give it "p", the slopes of a hyperplane, and it is easy to determine the corresponding point(s) on C). Sorry for not being explicit about this earlier, but I was trying to keep my post short. |
"Cory" wrote in message <jnuo2b$o3c$1@newscl01ah.mathworks.com>...
> > Thanks for this suggestion. Unfortunately, C is not easy to write out in that form. It would have a very large number of vertices which, with dimensions > 2, would lead to far too vast an "A" term for it to be computationally feasible. > > C can be described in terms of an optimization process (e.g. I give it "p", the slopes of a hyperplane, and it is easy to determine the corresponding point(s) on C). > > Sorry for not being explicit about this earlier, but I was trying to keep my post short. ========================= I think we need more explicit information about what f() looks like and about the problem size (dimension of the space, number of vertices, etc...). |
> I think we need more explicit information about what f()
looks like and about the problem size (dimension of the space, number of
vertices, etc...).
Sounds good. I'm working with an economic model of supply and demand. The convex set C is the "supply" side. The supply comes from a set of "fields" which split their resources between various crops. For instance, a field might devote all its energy to corn and produce 10 corn. Or it might devote itself solely to wheat and produce 20 wheat. Or it could produce 0.4*10 corn and (1 - 0.4)*20 wheat. Each good has a price. Fields always produce crops which gets them the most money. In the above example, if p_corn = $3 and p_wheat = $1, the field produces all corn. If p_corn changes to $1, it produces all wheat. If p_corn changes to $2, it can produce both corn and wheat in any legal proportion. So, C, the set of possible supplies, can be thought of as the set of optimal production possibilities (given a price). It is the boundary of a (non-strict) convex set. f is the "demand" side. Different agents own the fields and collect their revenue (i.e. price*quantity produced). They spend this revenue on the same goods that are being produced. f tells me how much total demand there is, given the quantities produced and prices. The goal is to find a point on C where the supply (q) is equal to the demand (a function of q and the tangent plane, i.e. the price, p). The number of fields is high (~10^6), the number of goods is smaller (~10^3). |
"Cory" wrote in message <jnuv1f$p21$1@newscl01ah.mathworks.com>...
> > I think we need more explicit information about what f() looks like and about the problem size (dimension of the space, number of vertices, etc...). > > Sounds good. I'm working with an economic model of supply and demand. The convex set C is the "supply" side. The supply comes from a set of "fields" which split their resources between various crops. For instance, a field might devote all its energy to corn and produce 10 corn. Or it might devote itself solely to wheat and produce 20 wheat. Or it could produce 0.4*10 corn and (1 - 0.4)*20 wheat. > > Each good has a price. Fields always produce crops which gets them the most money. In the above example, if p_corn = $3 and p_wheat = $1, the field produces all corn. If p_corn changes to $1, it produces all wheat. If p_corn changes to $2, it can produce both corn and wheat in any legal proportion. > > So, C, the set of possible supplies, can be thought of as the set of optimal production possibilities (given a price). It is the boundary of a (non-strict) convex set. > > f is the "demand" side. Different agents own the fields and collect their revenue (i.e. price*quantity produced). They spend this revenue on the same goods that are being produced. f tells me how much total demand there is, given the quantities produced and prices. > > The goal is to find a point on C where the supply (q) is equal to the demand (a function of q and the tangent plane, i.e. the price, p). > > The number of fields is high (~10^6), the number of goods is smaller (~10^3). ================ Yes, but I'm still looking for the equation defining f(p,q), f(p,q)=.....? |
> f(p,q)=.....?
Need to set supply = demand, so want q - demand(q,p) = 0. There's no specific form for it--I'll want to change it as a parameter--but for example it be done as follows: Define I_n is agent n's income (i.e. price times quantity of all fields the agent owns). Then: demand_c(q,p) = sum(I_n * w_cn / p_c), i.e. each agent spends a certain percent, w_cn, of her income on good c. In my particular example, each agent demands a distinct set of goods (so the sum is unnecessary). |
"Cory" wrote in message <jnv41i$epg$1@newscl01ah.mathworks.com>...
> > f(p,q)=.....? > > Need to set supply = demand, so want > > q - demand(q,p) = 0. There's no specific form for it--I'll want to change it as a parameter--but for example it be done as follows: ============ OK. Well, I understand certain things now and I'll begin my remarks with those. So basically what I understand is that your decision variables are p, a vector of prices of different goods, c. If we denote nc as the number of different types of goods, then p is a vector in R^nc. For a fixed and given price vector, p, you solve a linear program over the polyhedral constraint set C of different possible q vectors, where you're trying to maximize either dot(q,p) or else some other linear function of p. I can't really tell whether q is a vector in R^nc as well or whether it's a vector of a different dimension. Is q a list of the total quantities of each good c? Or is it the quantity of each good in each field? Or the quantity of each good produced by each agent? In any case, the linear program solution leads to a vector q listing quantities of goods. This makes q a function of p, so I'll write this as q(p). Moreover q(p) will be a piece-wise constant function of p, because in general, the solutions of linear programs vary with the objective function data in a piece-wise constant manner. Now, you want to solve f(p,q(p))=0 where the output of f() is a vector in R^n, but I don't know what n represents (the total number of fields? the total number of agents owning collections of fields?) In any case, assuming f(a,b) is continuous, then f(p,q(p)) will be piece-wise continuous, but with jump discontinuities, since q(p) is piecewise constant. So to sum it all up, you're trying to find the roots of a vector-valued, piece-wise jump-discontinuous function of very large dimension. On top of that, you claim not to have an explicit form A*x<=b of your constraint set C. This sounds really intractable, not only for MATLAB, but for any other software tool you're likely to find. I suppose one possibility would be for you to go look at the Global Optimization Toolbox, because the least squares solution of f(p,q(p))=0 is going to be a non-smooth, discontinuous problem, probably with a number of different local solutions and the Global Opt Toolbox is the only thing that would have any hope of attacking such problems in dimensions >1. I think the more likely path of success, though, would be to try to see if your problem can be reformulated/simplified, and clearing up my residual questions below might help with that: 1. So again, what vector space does q live in? Is it R^nc or something else? Same question for f(). I guess that q and f(p,q) live in the same space R^n because you said that one possible form for f is f(p,q)=q-demand(p,q) and that's only possible if they are of the same dimension. But again, what is n? 2. You said that you cannot tractably express C as inequalities A*x<=b, and that you only know its vertices. But your verbal description makes the opposite seem true. You seem to be deriving the "vertices" from inequalities already available for each field. In particular you know that q_i(c), the fraction of goods produced by the i-th field satisfies things like q_i(c)>=0 sum_c q_i(c)=1 so the inequality data appear to be already available. Yes, the matrix A required to represent these may be large, but it will be sparse and MATLAB can deal with sparse matrices. 3. If you have agents owning multiple fields, is there any reason not to combine all of the fields owned by the agent and view them as one big field? It seems like it would just lead to a different, but simpler and lower-dimensional, space of supplies. > Define I_n is agent n's income (i.e. price times quantity of all fields the agent owns). Then: > > demand_c(q,p) = sum(I_n * w_cn / p_c), i.e. each agent spends a certain percent, w_cn, of her income on good c. ==== The summation is over c, right? Shouldn't this be sum(I_nc * w_cn ) where I_cn is the income earned by agent n on good c alone? > In my particular example, each agent demands a distinct set of goods (so the sum is unnecessary). =============== Why would the sum be unnecessary (I'm still assuming the sum is over c). If each agent demands a distinct "set" of goods, then why don't you have to sum over this set? |
No comments:
Post a Comment