-
Notifications
You must be signed in to change notification settings - Fork 192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OCPBUGS-32776: Fix IBM Public Cloud DNS Provider Update Logic #1133
base: master
Are you sure you want to change the base?
OCPBUGS-32776: Fix IBM Public Cloud DNS Provider Update Logic #1133
Conversation
@gcs278: This pull request references Jira Issue OCPBUGS-32776, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
b387d07
to
b07d602
Compare
/jira refresh |
@gcs278: This pull request references Jira Issue OCPBUGS-32776, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
@gcs278: This pull request references Jira Issue OCPBUGS-32776, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
I missed one detail of the bug, missing instructions in the |
a057f1a
to
0716651
Compare
@lihongan Added the missing scope change instructions for the PowerVS type. /unhold |
infra failures |
0716651
to
f51b1d3
Compare
Infra issues |
pre-merge tested on OpenStack and looks good now
will test on IBMCloud later |
pre-merge tested on IBMCloud and also looks good
|
/assign |
Progressing
Condition
f51b1d3
to
7a97b9a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, thank you for the fix!
/retest |
4758a5b
to
ec2fcdf
Compare
Both IBM public and private cloud unit tests were missing unit test coverage. This update extends test coverage for the Delete and CreateOrUpdateRecord functions. This commit provides an important point of reference for future commits that may preturb the existing functionality. Both Test_createOrUpdateDNSRecord functions previously only tested the update logic. In order to test the create logic, `CreateDNSRecord` and `CreateResourceRecord` needed to be implemented in the public and private `fake_client.go` respectively. The new test cases required the ability to control the response and results of `ListAllDnsRecords`, which were previously hardcoded in both public and private IBM cloud unit tests. Both public and private unit tests were updated to use the new OutputResults field specified in the `ListAllDnsRecordsInputOutput` struct, allowing the new test cases to specify no result (indicating no existing DNS record) so we can trigger the create logic. Lastly, various test cases were added to cover untested scenarios, such as testing CNAME Records, mismatching targets, missing record types or IDs, handling nil results, etc.
32caad5
to
2223383
Compare
infra failures |
Removed WIP, sorry for the delay, I am open to reviews again @candita @Miciah @SzucsAti Updates since last review:
|
Keeping the hold while I wait to see what the results of adding a IBM Cloud job are openshift/release#56785 /hold |
/retest |
Now that openshift/release#56785 has merged, though it will most likely fail due to DNS Propagation issues (being tested in #1132), I'm curious to see if we see anything interesting: |
I don't think it's enabled, but just curious: |
/testwith openshift/cluster-ingress-operator/master/e2e-ibmcloud-operator #1164 |
2223383
to
396acf9
Compare
After introducing and running e2e-ibmcloud-operator on different PRs, I realized it was passing successfully without this fix. I was mistaken that our E2E would have caught this bug. However, it looks like adding a Now let's test to make sure everything works: |
The IBM Public Cloud DNS provider (`cis_provider.go`) had a bug in `createOrUpdateDNSRecord` where it checked for the existence of a DNS record by filtering both DNS name and target. If the target was updated (e.g., due to a load balancer recreation), the logic would not match the existing DNS record. As a result, the function would attempt to create a new record, but fail because a record with that name already existed, as multiple DNS records with the same name are not allowed in IBM Cloud DNS providers. The fix is to remove the filtering by target and rely solely on filtering by name, as the name is the only attribute that needs to be unique. Additionally, the IBM DNS logic doesn't work for multiple targets and this creates unexpected and problematic results. The logic has been refactored to only create using the first target and it warns the user when multiple targets are set. This change is low risk since the Ingress Operator will never create a DNSRecord with multiple targets in `desiredDNSRecord`. Modify TestScopeChange to check for `Available=True` after deleting the service to add test coverage for this issue.
396acf9
to
d93c10b
Compare
Forgot to push... |
/testwith openshift/cluster-ingress-operator/master/e2e-ibmcloud-operator openshift/cluster-ingress-operator/pull#1164 |
@gcs278,
|
/testwith openshift/cluster-ingress-operator/master/e2e-ibmcloud-operator https://github.com/openshift/cluster-ingress-operator#1164 |
multi-pr-openshift-cluster-ingress-operator-1133-openshift-cluster-ingress-operator-1164 passed which has E2E test coverage for this bug, proving it resolved the problem. |
/retest-required |
/retest |
@gcs278: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
if resourceRecord.ID == nil { | ||
return fmt.Errorf("delete: record id is nil") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not check this earlier, like where we check the resourceRecord.Type?
@@ -190,6 +194,10 @@ func (p *Provider) createOrUpdateDNSRecord(record *iov1.DNSRecord, zone configv1 | |||
log.Info("Warning: TTL must be one of [1 60 120 300 600 900 1800 3600 7200 18000 43200]. RecordTTL set to default", "default DSNSVCS record TTL", defaultDNSSVCSRecordTTL) | |||
record.Spec.RecordTTL = defaultDNSSVCSRecordTTL | |||
} | |||
// We only support one target, warn the user. | |||
if len(record.Spec.Targets) > 1 { | |||
log.Info("Warning: Only one DNSRecord target is supported. Additional targets will be ignored.", "targets", record.Spec.Targets) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did you decide that this should only be an Info warning? Shouldn't it be an Error to be more visible?
case iov1.CNAMERecordType: | ||
inputRData, err := p.dnsService.NewResourceRecordInputRdataRdataCnameRecord(target) | ||
|
||
// TODO DNS record update should handle the case where we have an A record and want a CNAME record or vice versa |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see this is a carried-over TODO, but should it be done now in order to fix the bug? Or is it out of scope?
@@ -152,6 +154,10 @@ func (p *Provider) createOrUpdateDNSRecord(record *iov1.DNSRecord, zone configv1 | |||
log.Info("Warning: TTL must be between 120 and 2,147,483,647 seconds, or 1 for Automatic. RecordTTL set to default", "default CIS record TTL", defaultCISRecordTTL) | |||
record.Spec.RecordTTL = defaultCISRecordTTL | |||
} | |||
// We only support one target, warn the user. | |||
if len(record.Spec.Targets) > 1 { | |||
log.Info("Warning: Only one DNSRecord target is supported. Additional targets will be ignored.", "targets", record.Spec.Targets) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be more visible if it was logged as an Error.
} | ||
} | ||
if result == nil || result.Result == nil { | ||
return fmt.Errorf("createOrUpdateDNSRecord: ListAllDnsRecords returned nil as result") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens for the very first record created? We can't find any to update, but we need to go on to create it instead of returning an error, don't we?
createDnsRecordInputOutput: dnsclient.CreateDnsRecordInputOutput{ | ||
InputId: "testCreate", | ||
OutputError: nil, | ||
OutputResponse: &core.DetailedResponse{StatusCode: http.StatusOK}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to find a way to check that the list was updated?
OutputError: errors.New("error in CreateDnsRecord"), | ||
OutputResponse: &core.DetailedResponse{StatusCode: http.StatusRequestTimeout}, | ||
}, | ||
expectErrorContains: "error in CreateDnsRecord", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't really seem to be testing any error input if we're just matching the supplied error.
func (fdc FakeDnsClient) ListResourceRecords(listResourceRecordsOptions *dnssvcsv1.ListResourceRecordsOptions) (result *dnssvcsv1.ListResourceRecords, response *core.DetailedResponse, err error) { | ||
fakeListDnsrecordsResp := &dnssvcsv1.ListResourceRecords{} | ||
recordType := string(iov1.ARecordType) | ||
rData := map[string]interface{}{"ip": fdc.ListAllDnsRecordsInputOutput.RecordTarget} | ||
|
||
fakeListDnsrecordsResp.ResourceRecords = append(fakeListDnsrecordsResp.ResourceRecords, dnssvcsv1.ResourceRecord{ID: &fdc.ListAllDnsRecordsInputOutput.RecordName, Name: &fdc.ListAllDnsRecordsInputOutput.RecordName, Type: &recordType, Rdata: rData}) | ||
|
||
resp := &core.DetailedResponse{ | ||
StatusCode: fdc.ListAllDnsRecordsInputOutput.OutputStatusCode, | ||
Headers: map[string][]string{}, | ||
Result: result, | ||
RawResult: []byte{}, | ||
} | ||
|
||
return fakeListDnsrecordsResp, resp, fdc.ListAllDnsRecordsInputOutput.OutputError | ||
return fdc.ListAllDnsRecordsInputOutput.OutputResult, fdc.ListAllDnsRecordsInputOutput.OutputResponse, fdc.ListAllDnsRecordsInputOutput.OutputError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
listResourceRecordsOptions
is unused. Also, can you add a comment about what is actually being returned here? We haven't changed anything.
"k8s.io/utils/pointer" | ||
|
||
"github.com/IBM/go-sdk-core/v5/core" | ||
"github.com/IBM/networking-go-sdk/dnssvcsv1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment for imports not in order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll review this further after learning more about my questions in cis_provider_test.go
The IBM Public Cloud DNS provider (
cis_provider.go
) had a bug increateOrUpdateDNSRecord
where it checked for the existence of a DNS record by filtering both DNS name and target. If the target was updated (e.g., due to a load balancer recreation), the logic would not match the existing DNS record. As a result, the function would attempt to create a new record, but fail because a record with that name already existed, as multiple DNS records with the same name are not allowed.The fix is to remove the filtering by target and rely solely on filtering by name, as the name is the only attribute that needs to be unique.
Additionally, the IBM DNS logic doesn't work for multiple targets and this creates unexpected and problematic results. The logic has been refactored to only create and delete using the first target. It warns the user when multiple targets are set.
This PR also includes some unit test fix up and missing unit test coverage for the IBM CIS Provider.
This resolves the same DNS issues for public PowerVS cloud as it uses the same logic.