Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Argument to have GramSchmidt return matrix having same dimension as input #48

Open
ggrothendieck opened this issue Mar 6, 2023 · 15 comments

Comments

@ggrothendieck
Copy link

ggrothendieck commented Mar 6, 2023

Currently GramSchmidt contains the line below which removes the zero columns

B <- B[, !apply(B, 2, function(b) all(b == 0))]

however, that makes it hard to relate the result to the input matrix. Suggest that there be an argument to skip over that line so that the output columns directly correspond to the input columns in the same position.

@john-d-fox
Copy link
Collaborator

john-d-fox commented Mar 7, 2023 via email

@friendly
Copy link
Owner

friendly commented Mar 7, 2023

I agree that optionally allowing to retain the 0 columns is desireable, and I think the suggestion to make the quoted line conditional would do that.

An argument could be made either way for what should be the default, but at least it would be documented.

Perhaps review the examples / vignettes to see how GramSchmidt() is currently used in the package?

@john-d-fox
Copy link
Collaborator

john-d-fox commented Mar 8, 2023 via email

@friendly
Copy link
Owner

friendly commented Mar 8, 2023

Accepted John's changes, now in master. I've bumped the pkg version to 0.8-7, but will wait to release this.

@ggrothendieck You can (after this settles down) install the github version to test this.

@ggrothendieck
Copy link
Author

ggrothendieck commented Mar 15, 2023

Seems to work when I tried it.

Since the output is now the same shape as the input one could retain the column names to make it easier to relate the output to the input -- actually it could be done even if zero columns were omitted.

transform(BOD, z = Time + demand, zz = rnorm(6)) |> 
  as.matrix() |>
  GramSchmidt(omit_zero_columns = FALSE)
##            [,1]        [,2] [,3]        [,4]
## [1,] 0.09805807  0.28210212    0 -0.02308851
## [2,] 0.19611614  0.24258721    0  0.59730121
## [3,] 0.29417420  0.54510946    0  0.39265953
## [4,] 0.39223227  0.25034293    0 -0.61734941
## [5,] 0.49029034 -0.70807780    0  0.27453213
## [6,] 0.68640647  0.01948748    0 -0.17896507

@john-d-fox
Copy link
Collaborator

Preserving the column names makes sense to me (and is easily done). Shall I make this change?

@friendly
Copy link
Owner

@ggrothendieck @john-d-fox
This change seems sensible, but I'd like to see an example of what omit_zero_columns = TRUE should/does look like.
Does this change still work if the matrix has row/col names?

@john-d-fox
Copy link
Collaborator

The names of the nonzero columns are retained. Yes, the change works if there are row/column names. Row names always worked, and retaining column names is the point.

Shall I commit the change? If you don’t like it, you can always revert it (though I doubt you’ll want to).

@ggrothendieck
Copy link
Author

The example shown in my earlier post using BOD had input column names but no column names appeared on the output.

@john-d-fox
Copy link
Collaborator

Right. I'm waiting to hear from Michael before committing the fix, which displays the column names if they're defined.

@john-d-fox
Copy link
Collaborator

FWIW, here's the code:

GramSchmidt <- function (X, normalize = TRUE, verbose = FALSE,
                         tol=sqrt(.Machine$double.eps), omit_zero_columns=TRUE) {
  B <- X
  B[, 2L:ncol(B)] <- 0
  if (verbose) {
    cat("\nInitial matrix:\n")
    printMatrix(X)
    cat("\nColumn 1: z1 = x1\n")
    printMatrix(B)
  }
  for (i in 2:ncol(X)) {
    res <- X[, i]
    for (j in 1:(i - 1)) res <- res - Proj(X[, i], B[, j])
    if (len(res) < tol) res <- 0
    B[, i] <- res
    if (verbose) {
      prj_char <- character(i - 1)
      for (j in 1:(i - 1)) prj_char[j] <- sprintf("Proj(x%i, z%i)",
                                                  i, j)
      cat(sprintf("\nColumn %i: z%i = x%i - %s\n", i, i,
                  i, paste0(prj_char, collapse = " - ")))
      printMatrix(B)
    }
  }
  zeros <- apply(B, 2, function(b) all(b == 0))
  if (omit_zero_columns && any(zeros)){
    B <- B[, !zeros]
    which_zeros <- (1:ncol(X))[zeros]
    message("linearly dependent column", if (length(which_zeros) > 1) "s" else "",
            " omitted: ", paste(which_zeros, collapse=", "))
  }
  if (normalize) {
    norm <- diag(1/len(B))
    colnames <- colnames(B)
    B <- B %*% norm
    colnames(B) <- colnames
    if (!omit_zero_columns && any(zeros)) B[, zeros] <- 0
    if (verbose) {
      cat("\nNormalized matrix: Z * inv(L) \n")
      printMatrix(B)
    }
  }
  if (verbose)
    return(invisible(B))
  B
}

@friendly
Copy link
Owner

Looks good. Go ahead and commit to master.

@john-d-fox
Copy link
Collaborator

That's apparently easier said than done. When I tried to push the change, I got

>>> /usr/bin/git push origin HEAD:refs/heads/master
remote: error: GH006: Protected branch update failed for refs/heads/master.        
remote: error: At least 1 approving review is required by reviewers with write access.        
To https://github.com/friendly/matlib.git
 ! [remote rejected] HEAD -> master (protected branch hook declined)
error: failed to push some refs to 'https://github.com/friendly/matlib.git'

I suggest that you take the code from my previous "comment" to update GramSchmidt() in the master branch since I apparently am not allowed to do it.

@friendly
Copy link
Owner

friendly commented Mar 16, 2023

Ugh! This is some new github feature regarding permission I don't quite understand. Evidently, new additions to master must be via a pull request from another branch, or a pull request to the master branch that someone else (me) has to approve. In any case, it has to be via a pull request.

I don't think I can just copy/paste your code in the comment into the GramSchmidt.R.

I created a new branch, devel-0.9-7. Can you do a pull from RStudio git, switch to the devel-0.9-7 branch, and commit your
update there? and then push to that branch... Then, either you or I can do the pull request thing. I'm sorry that this has become an extra burden.

>>> C:/Program Files/Git/bin/git.exe checkout -B devel-0.9.7
Switched to a new branch 'devel-0.9.7'
>>> C:/Program Files/Git/bin/git.exe push -u origin devel-0.9.7
remote: 
remote: Create a pull request for 'devel-0.9.7' on GitHub by visiting:        
remote:      https://github.com/friendly/matlib/pull/new/devel-0.9.7        
remote: 
branch 'devel-0.9.7' set up to track 'origin/devel-0.9.7'.
To github.com:friendly/matlib.git
 * [new branch]      devel-0.9.7 -> devel-0.9.7

@john-d-fox
Copy link
Collaborator

OK, I pushed the change to GramSchmidt() to the devel-0.9-7. Can you take it from here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants